Isolated human secreted proteins, nucleic acid molecules encoding human secreted proteins, and uses thereof

ABSTRACT

The present invention provides amino acid sequences of peptides that are encoded by genes within the human genome, the secreted peptides of the present invention. The present invention specifically provides isolated peptide and nucleic acid molecules, methods of identifying orthologs and paralogs of the secreted peptides, and methods of identifying modulators of the secreted peptides.

RELATED APPLICATIONS

[0001] The present application claims priority to provisional application U.S. Serial No. ______, filed Oct. 27, 2000 (Atty. Docket CL000841-PROV).

FIELD OF THE INVENTION

[0002] The present invention is in the field of secreted proteins that are related to the retinoic acid receptor responder secreted subfamily, recombinant DNA molecules, and protein production. The present invention specifically provides novel peptides and proteins that effect protein phosphorylation and nucleic acid molecules encoding such peptide and protein molecules, all of which are useful in the development of human therapeutics and diagnostic compositions and methods.

BACKGROUND OF THE INVENTION

[0003] Secreted Proteins

[0004] Many human proteins serve as pharmaceutically active compounds. Several classes of human proteins that serve as such active compounds include hormones, cytokines, cell growth factors, and cell differentiation factors. Most proteins that can be used as a pharmaceutically active compound fall within the family of secreted proteins. It is, therefore, important in developing new pharmaceutical compounds to identify secreted proteins that can be tested for activity in a variety of animal models. The present invention advances the state of the art by providing many novel human secreted proteins.

[0005] Secreted proteins are generally produced within cells at the rough endoplasmic reticulum, are then exported to the Golgi complex, and then move to secretory vesicles or granules, from which they are secreted to the exterior of the cell via exocytosis.

[0006] Secreted proteins are particularly useful as diagnostic markers. Many secreted proteins are found, and can easily be measured, in serum. For example, a ‘signal sequence trap’ technique can often be utilized because many secreted proteins, such as certain secretory breast cancer proteins, contain a molecular signal sequence for cellular export. Additionally, antibodies against particular secreted serum proteins can serve as potential diagnostic agents, such as for diagnosing cancer.

[0007] Secreted proteins play a critical role in a wide array of important biological processes in humans and have numerous utilities; several illustrative examples are discussed herein. For example, fibroblast secreted proteins participate in extracellular matrix formation. Extracellular matrix affects growth factor action, cell adhesion, and cell growth. Structural and quantitative characteristics of fibroblast secreted proteins are modified during the course of cellular aging, and such aging-related modifications may lead to increased inhibition of cell adhesion, inhibited cell stimulation by growth factors, and inhibited cell proliferative ability (Eleftheriou et al., Mutat Res 1991 March-November;256(2-6): 127-38).

[0008] The secreted form of amyloid beta/A4 protein precursor (APP) functions as a growth and/or differentiation factor. The secreted form of APP can stimulate neurite extension of cultured neuroblastoma cells, presumably through binding to a cell surface receptor and thereby triggering intracellular transduction mechanisms (Roch et al., Ann NY Acad Sci 1993 Sep. 24;695:149-57). Secreted APPs modulate neuronal excitability, counteract effects of glutamate on growth cone behaviors, and increase synaptic complexity. The prominent effects of secreted APPs on synaptogenesis and neuronal survival suggest that secreted APPs play a major role in the process of natural cell death and, furthermore, may play a role in the development of a wide variety of neurological disorders, such as stroke, epilepsy, and Alzheimer's disease (Mattson et al., Perspect Dev Neurobiol 1998; 5(4):337-52).

[0009] Breast cancer cells secrete a 52K estrogen-regulated protein (see Rochefort et al., Ann NY Acad Sci 1986;464:190-201). This secreted protein is therefore useful in breast cancer diagnosis.

[0010] Two secreted proteins released by platelets, platelet factor 4 (PF4) and beta-thromboglobulin (betaTG), are accurate indicators of platelet involvement in hemostasis and thrombosis and assays that measure these secreted proteins are useful for studying the pathogenesis and course of thromboembolic disorders (Kaplan, Adv Exp Med Biol 1978; 102:105-19).

[0011] Vascular endothelial growth factor (VEGF) is another example of a naturally secreted protein. VEGF binds to cell-surface heparan sulfates, is generated by hypoxic endothelial cells, reduces apoptosis, and binds to high-affinity receptors that are up-regulated by hypoxia (Asahara et al., Semin Interv Cardiol 1996 September;1(3):225-32).

[0012] Many critical components of the immune system are secreted proteins, such as antibodies, and many important functions of the immune system are dependent upon the action of secreted proteins. For example, Saxon et al., Biochem Soc Trans 1997 May;25(2):383-7, discusses secreted IgE proteins.

[0013] For a further review of secreted proteins, see Nilsen-Hamilton et al., Cell Biol Int Rep 1982 September;6(9):815-36.

[0014] Retinoic Acids and Retinoic Acid Receptors

[0015] Retinoids, or vitamin A metabolites/derivatives, have been determined to play essential roles in many aspects of development, metabolism and reproduction in vertebrates (see, for example, The Retinoids, Second Edition, Sporn et al. (Raven Press, New York, 1994)). There are two classes of retinoid receptors: the retinoic acid receptors (RARs), which bind to both all-trans retinoic acid (atRA) and 9-cis retinoic acid (9cRA), and the retinoid X receptors (RXRs), which bind only to 9cRA. These receptors modulate ligand-dependent gene expression by interacting as RXR/RAR heterodimers or RXR homodimers on specific target gene DNA sequences known as hormone response elements. In addition to their role in retinoid signalling, RXRs also serve as heterodimeric partners of nuclear receptors for vitamin D, thyroid hormone, and peroxisome proliferators (reviewed by Mangelsdorf et al., at pages 319-349 of The Retinoids, Second Edition, Sporn et al. (Raven Press, New York, 1994)).

[0016] A number of studies have demonstrated that retinoids are essential for normal growth, vision, tissue homeostasis, reproduction and overall survival (for reviews and references, see Sporn et al., The Retinoids, Vols. 1 and 2, Sporn et al., eds., Academic Press, Orlando, Fla. (1984)). For example, retinoids have been shown to be vital to the maintenance of skin homeostasis and barrier function in mammals (Fisher, G. J., and Voorhees, J. J., FASEB J. 10:1002-1013 (1996)). Retinoids are also apparently crucial during embryogenesis, since offspring of dams with vitamin A deficiency (VAD) exhibit a number of developmental defects (Wilson, J. G., et al., Am. J. Anat. 92:189-217 (1953); Morriss-Kay, G. M., and Sokolova, N., FASEB J. 10:961-968 (1996)). With the exceptions of those on vision (Wald, G., et al., Science 162:230-239 (1968)) and spermatogenesis in mammals (van Pelt, H. M. M., and De Rooij, D. G., Endocrinology 128:697-704 (1991)), most of the effects generated by VAD in animals and their fetuses can be prevented and/or reversed by retinoic acid (RA) administration (Wilson, J. G., et al., Am. J. Anat. 92:189-217 (1953); Thompson et al., Proc. Royal Soc. 159:510-535 (1964); Morriss-Kay, G. M., and Sokolova, N., FASEB J. 10:961-968 (1996)). The dramatic teratogenic effects of maternal RA administration on mammalian embryos (Shenefelt, R. E., Teratology 5:103-108 (1972); Kessel, M., Development 115:487-501 (1992); Creech Kraft, J., In Retinoids in Normal Development and Teratogenesis, G. M. Morriss-Kay, ed., Oxford University Press, Oxford, UK, pp. 267-280 (1992)), and the marked effects of topical administration of retinoids on embryonic development of vertebrates and limb regeneration in amphibians (Mohanty-Hejmadi et al., Nature 355:352-353 (1992); Tabin, C. J., Cell 66:199-217 (1991)), have contributed to the notion that RA may have critical roles in morphogenesis and organogenesis.

[0017] Many synthetic structural analogues of all-trans retinoic acid or 9-cis-retinoic acid, commonly termed “retinoids”, have been described in the literature to date. Some of these molecules are able to bind to, and specifically activate, the RARs or, on the other hand, the RXRs. Furthermore, some analogues are able to bind to, and activate, a particular RAR receptor subtype (α, β or γ). Finally, other analogues do not exhibit any particular selective activity with regard to these different receptors. In this respect, and by way of example, 9-cis-retinoic acid activates the RARs and the RXRs at one and the same time without any noteworthy selectivity for either of these receptors (nonspecific agonist ligand), whereas all-trans retinoic acid selectively activates the RARs (RAR-specific agonist ligand), with all subtypes being included. In a general manner, and qualitatively, a given substance (or ligand) is said to be specific for a given family of receptors (or, respectively, for a particular receptor of this family) when the said substance exhibits an affinity for all the receptors of this family (or, respectively, for the particular receptor of this family) which is stronger than that which it otherwise exhibits for all the receptors of any other family (or, respectively, for all the other receptors, of this same family or not).
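By way of illustration only, the qualitative family-specificity criterion stated above can be sketched as a comparison of binding affinities. The sketch assumes that affinity is expressed as a dissociation constant (lower value, stronger affinity); the ligand names, the numerical values, and the function name is_family_specific are hypothetical placeholders and are not measured data.

```python
from typing import Dict, List

# Receptor families as named in the text above.
FAMILIES: Dict[str, List[str]] = {
    "RAR": ["RARα", "RARβ", "RARγ"],
    "RXR": ["RXRα", "RXRβ", "RXRγ"],
}

# Hypothetical dissociation constants in nM (lower = stronger affinity).
# Illustrative placeholders only, not experimental values.
KD_NM: Dict[str, Dict[str, float]] = {
    "ligand_A": {"RARα": 1.0, "RARβ": 2.0, "RARγ": 1.5,
                 "RXRα": 500.0, "RXRβ": 800.0, "RXRγ": 650.0},
    "ligand_B": {"RARα": 5.0, "RARβ": 4.0, "RARγ": 6.0,
                 "RXRα": 7.0, "RXRβ": 3.0, "RXRγ": 5.0},
}


def is_family_specific(kd: Dict[str, float], family: str,
                       families: Dict[str, List[str]] = FAMILIES) -> bool:
    """Return True when the ligand binds every receptor of `family` more
    strongly than it binds any receptor of any other family."""
    in_family = [kd[r] for r in families[family]]
    out_of_family = [kd[r]
                     for name, members in families.items() if name != family
                     for r in members]
    # The weakest in-family binding must still beat the strongest
    # out-of-family binding.
    return max(in_family) < min(out_of_family)


print(is_family_specific(KD_NM["ligand_A"], "RAR"))
print(is_family_specific(KD_NM["ligand_B"], "RAR"))
```

Under these assumed values, ligand_A would be classed as RAR-specific, whereas ligand_B would be classed as nonspecific. The criterion for a single receptor subtype follows the same pattern, with the comparison made against all other receptors individually.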

[0018] The genetic activities of the RA signal are mediated through the two families of receptors—the RAR family and the RXR family—which belong to the superfamily of ligand-inducible transcriptional regulatory factors that include steroid/thyroid hormone and vitamin D3 receptors (for reviews see Leid et al., TIBS 17:427-433 (1992); Chambon, P., Semin. Cell Biol. 5:115-125 (1994); Chambon, P., FASEB J. 10:940-954 (1996); Giguere, V., Endocrinol. Rev. 15:61-79 (1994); Mangelsdorf, D. J., and Evans, R. M., Cell 83:841-850 (1995); Gronemeyer, H., and Laudet, V., Protein Profile 2:1173-1236 (1995)).

[0019] RARs are the critical factors in tissue differentiation and development. They are up-regulated in rapidly dividing cells and tumors. RARs play an important role in lymphocyte activation. Synthetic antagonists of retinoic acid receptors can inhibit delayed type hypersensitivity (DTH). Growth factors and carotene regulate RXR expression levels. For example, granulocyte macrophage colony-stimulating factor induces retinoic acid receptors in myeloid leukemia cells.

[0020] Retinoic acid receptors can form heterodimers with other nuclear receptors. The protein provided by the present invention can be used as a probe to detect possible interactions in the two-hybrid assay. Synthetic peptides that mimic the dimerization surface can disrupt intermolecular interactions between these receptors. RAR gene rearrangements are the primary causes of some types of leukemia and provide a convenient genetic marker for malignant cell lines. A number of retinoic acid derivatives are used in treatment of myelodysplastic disorders. They are designed to bind and activate RXRs. Beta-carotene can prevent skin tumor formation in mouse models. N-(4-hydroxyphenyl)retinamide can delay onset of dysplasia in bronchi. Different chemopreventive drugs can be designed to target individual retinoic acid receptors. The sequences provided by the present invention may be used to design high affinity chemopreventive compounds.

[0021] Although both the RARs and RXRs respond to all-trans-retinoic acid in vivo, the receptors differ in several important aspects. First, the RARs and RXRs are significantly divergent in primary structure (e.g., the ligand binding domains of RARα and RXRα have only 27% amino acid identity). These structural differences are reflected in the different relative degrees of responsiveness of RARs and RXRs to various vitamin A metabolites and synthetic retinoids. In addition, distinctly different patterns of tissue distribution are seen for RARs and RXRs. For example, in contrast to the RARs, which are generally not expressed at high levels in the visceral tissues, RXRα mRNA has been shown to be most abundant in the liver, kidney, lung, muscle and intestine. Finally, the RARs and RXRs have different target gene specificity. For example, response elements have recently been identified in the cellular retinol binding protein type II (CRBPII) and apolipoprotein AI genes which confer responsiveness to RXR, but not RAR. Furthermore, RAR has also been recently shown to repress RXR-mediated activation through the CRBPII RXR response element (Mangelsdorf et al., Cell, 66:555-61 (1991)). These data indicate that the two retinoic acid responsive pathways are not simply redundant, but instead manifest a complex interplay. Recently, Heyman et al. (Cell, 68:397-406 (1992)) and Levin et al. (Nature, 355:359-61 (1992)) independently demonstrated that 9-cis-retinoic acid is a natural endogenous ligand for the RXRs. 9-cis-retinoic acid was shown to bind and transactivate the RXRs, as well as the RARs, and therefore appears to act as a “bifunctional” ligand.
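For illustration, a percent amino acid identity figure of the kind quoted above for the ligand binding domains can be computed from a pairwise alignment roughly as follows. This is a minimal sketch under stated assumptions: the two sequences are already aligned to equal length with "-" marking gaps, identity is counted over columns in which neither sequence has a gap (other denominator conventions exist), and the function name percent_identity and the short example sequences are hypothetical and are not the receptor sequences themselves.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns in which both sequences carry a residue
    matches = 0   # compared columns in which the residues are identical
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # gap columns are skipped under this convention
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical): three of four columns are identical.
print(percent_identity("ACDE", "ACDF"))
```

Because the result depends on the alignment and on the denominator convention chosen, values computed in this way are comparable only when the same procedure is used throughout.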

[0022] RAR Receptors

[0023] Receptors belonging to the RAR family (RARα, β and γ and their isoforms) are activated by both all-trans- and 9-cis-RA (Leid et al., TIBS 17:427-433 (1992); Chambon, P., Semin. Cell Biol. 5:115-125 (1994); Dolle, P., et al., Mech. Dev. 45:91-104 (1994); Chambon, P., FASEB J. 10:940-954 (1996)). Within a given species, the DNA binding (C) and the ligand binding (E) domains of the three RAR types are highly similar, whereas the C-terminal domain F and the middle domain D exhibit little or no similarity. The amino acid sequences of the three RAR types are also notably different in their B regions, and their main isoforms (α1 and α2, β1 to β4, and γ1 and γ2) further differ in their N-terminal A regions (Leid et al., TIBS 17:427-433 (1992)). Amino acid sequence comparisons have revealed that the interspecies conservation of a given RAR type is greater than the similarity found between the three RAR types within a given species (Leid et al., TIBS 17:427-433 (1992)). This interspecies conservation is particularly striking in the N-terminal A regions of the various RARα, β and γ isoforms, whose A region amino acid sequences are quite divergent. Taken together with the distinct spatio-temporal expression patterns observed for the transcripts of each RAR and RXR type in the developing embryo and in various adult mouse tissues (Zelent, A., et al., Nature 339:714-717 (1989); Dolle, P., et al., Nature 342:702-705 (1989); Dolle et al., Development 110:1133-1151 (1990); Ruberte et al., Development 108:213-222 (1990); Ruberte et al., Development 111:45-60 (1991); Mangelsdorf et al., Genes & Dev. 6:329-344 (1992)), this interspecies conservation has suggested that each RAR type (and isoform) may perform unique functions. This hypothesis is further supported by the finding that the various RAR isoforms contain two transcriptional activation functions (AFs) located in the N-terminal A/B region (AF-1) and in the C-terminal E region (AF-2), which can synergistically, and to some extent differentially, activate various RA-responsive promoters (Leid et al., TIBS 17:427-433 (1992); Nagpal, S., et al., Cell 70:1007-1019 (1992); Nagpal, S., et al., EMBO J. 12:2349-2360 (1993)).

[0024] RXR Receptors

[0025] Unlike the RARs, members of the retinoid X receptor family (RXRα, β and γ) are activated exclusively by 9-cis-RA (Chambon, P., FASEB J. 10:940-954 (1996); Chambon, P., Semin. Cell Biol. 5:115-125 (1994); Dolle, P., et al., Mech. Dev. 45:91-104 (1994); Linney, E., Current Topics in Dev. Biol. 27:309-350 (1992); Leid et al., TIBS 17:427-433 (1992); Kastner et al., in Vitamin A in Health and Disease, R. Blomhoff, ed., Marcel Dekker, New York (1993)). However, the RXRs characterized to date are similar to the RARs in that the different RXR types also differ markedly in their N-terminal A/B regions (Leid et al., TIBS 17:427-433 (1992); Leid et al., Cell 68:377-395 (1992); Mangelsdorf et al., Genes and Dev. 6:329-344 (1992)), and contain the same transcriptional activation functions in their N-terminal A/B region and C-terminal E region (Leid et al., TIBS 17:427-433 (1992); Nagpal, S., et al., Cell 70:1007-1019 (1992); Nagpal, S., et al., EMBO J. 12:2349-2360 (1993)).

[0026] RXRα and RXRβ have a widespread (possibly ubiquitous) expression pattern during mouse development and in the adult animal, being found in all fetal and adult tissues thus far examined (Mangelsdorf, D. J., et al., Genes & Devel. 6:329-344 (1992); Dolle, P., et al., Mech. Devel. 45:91-104 (1994); Nagata, T., et al., Gene 142:183-189 (1994)). RXRγ transcripts, however, appear to have a more restricted distribution, being expressed in developing skeletal muscle in the embryo (where their expression persists throughout life), in the heart (after birth), in sensory epithelia of the visual and auditory systems, in specific structures of the central nervous system, and in tissues involved in thyroid hormone homeostasis, e.g., the thyroid gland and thyrotrope cells in the pituitary (Mangelsdorf, D. J., et al., Genes & Devel. 6:329-344 (1992); Dolle, P., et al., Mech. Devel. 45:91-104 (1994); Sugawara, A., et al., Endocrinology 136:1766-1774 (1995); Liu, Q., and Linney, E., Mol. Endocrinol. 7:651-658 (1993)).

[0027] It is currently unclear whether all the molecular properties of RXRs characterized in vitro are relevant for their physiological functions in vivo. In particular, it is unknown under what conditions these receptors act as 9-cis-RA-dependent transcriptional regulators (Chambon, P., Semin. Cell Biol. 5:115-125 (1994)). The knock-outs of RXRα and RXRβ in the mouse have provided some insight into the physiological functions of these receptors. For example, the ocular and cardiac malformations observed in RXRα−/− fetuses (Kastner, P., et al., Cell 78:987-1003 (1994); Sucov, H. M., et al., Genes & Devel. 8:1007-1018 (1994)) are similar to those found in the fetal VAD syndrome, thus suggesting an important function of RXRα in the transduction of a retinoid signal during development. The involvement of RXRs in retinoid signaling is further supported by studies of compound RXRα/RAR mutants, which reveal defects that are either absent or less severe in the single mutants (Kastner, P., et al., Cell 78:987-1003 (1994); Kastner, P., et al., Cell 83:859-869 (1995)). Interestingly, however, knockout of RXRγ in the mouse induces no overt deleterious effects, and RXRγ−/− homozygotes which are also RXRα−/− or RXRβ−/− exhibit no additional abnormalities beyond those seen in RXRα−/−, RXRβ−/− and fetal VAD syndrome fetuses (Krezel, W., et al., Proc. Natl. Acad. Sci. USA 93(17):9010-9014 (1996)), suggesting that RXRγ, despite its highly tissue-specific expression pattern in the developing embryo, is dispensable for embryonic development and postnatal life in the mouse. The observation that live-born RXRγ−/−/RXRβ−/−/RXRα+/− mutants can grow to reach adult age (Krezel et al., Proc. Natl. Acad. Sci. USA 93(17):9010-9014 (1996)) indicates that a single RXRα allele is sufficient to carry out all of the vital developmental and postnatal functions of the RXR family of receptors, particularly all of the developmental functions which depend on RARs and may require RXR partnership (Dolle, P., et al., Mech. Dev. 45:91-104 (1994); Kastner, P., et al., Cell 83:859-869 (1995)). Furthermore, the finding that RXRα−/−/RXRγ−/− double mutant embryos are not more affected than are single RXRα−/− mutants (Krezel et al., Proc. Natl. Acad. Sci. USA 93(17):9010-9014 (1996)) clearly shows that RXRβ alone can also perform some of these functions. Therefore, the fact that RXRα alone and, to a certain extent, RXRβ alone are sufficient for the completion of a number of developmental RXR functions clearly indicates the existence of a large degree of functional redundancy amongst RXRs. In this respect, the RXR situation is different from that of RARs, since all types of RAR double mutants displayed much broader sets of defects than single mutants (Rowe, A., et al., Develop. 111:771-778 (1991); Lohnes, D., et al., Develop. 120:2723-2748 (1994); Mendelsohn, C., Develop. 120:2749-2771 (1994)).

[0028] Retinoid Binding to RAR and RXR Receptors

[0029] The crystal structures of the ligand-binding domains (LBDs) of the RARs and RXRs have recently been elucidated (Bourguet, W., et al., Nature 375:377-382 (1995); Renaud, J. P., et al., Nature 378:681-689 (1995); Wurtz, J. M., et al., Nature Struct. Biol. 3:87-94 (1996)). Among the various RAR types, substantial amino acid sequence identity is observed in these domains: comparison of the LBDs of RARα, RARβ and RARγ indicates that only three amino acid residues are variable in the ligand-binding pocket of these receptors. These residues apparently account for the fact that the various RAR types exhibit some selectivity in binding certain synthetic retinoids (Chen, J.-Y., et al., EMBO J. 14(6):1187-1197 (1995); Renaud, J. P., et al., Nature 378:681-689 (1995)), and consideration of these divergent residues can be used to design RAR type-specific synthetic retinoids which may be agonistic or antagonistic (Chambon, P., FASEB J. 10:940-954 (1996)). This design approach may be extendable generally to other nuclear receptors, such as thyroid receptor α (Wagner, R. L., et al., Nature 378:690-697 (1995)), the ligand-binding pockets of which may chemically and structurally resemble those of the RARs (Chambon, P., FASEB J. 10:940-954 (1996)). Conversely, molecular modeling of the ligand-binding pocket of the RXRs demonstrates that there are no overt differences in amino acid composition between RXRα, RXRβ and RXRγ (Bourguet, W., et al., Nature 375:377-382 (1995); Wurtz, J. M., et al., Nature Struct. Biol. 3:87-94 (1996)), suggesting that design of type-specific synthetic ligands for the RXRs may be more difficult than for the RARs (Chambon, P., FASEB J. 10:940-954 (1996)).

[0030] Retinoid Signaling Through RAR:RXR Heterodimers

[0031] Nuclear receptors (NRs) are members of a superfamily of ligand-inducible transcriptional regulatory factors that include receptors for steroid hormones, thyroid hormones, vitamin D3 and retinoids (Leid, M., et al., Trends Biochem. Sci. 17:427-433 (1992); Leid, M., et al., Cell 68:377-395 (1992); and Linney, E. Curr. Top. Dev. Biol., 27:309-350 (1992)). NRs exhibit a modular structure which reflects the existence of several autonomous functional domains. Based on amino acid sequence similarity between the chicken estrogen receptor, the human estrogen and glucocorticoid receptors, and the v-erb-A oncogene, Krust, A., et al. (EMBO J. 5:891-897 (1986)) defined six regions (A, B, C, D, E and F) which display different degrees of evolutionary conservation amongst various members of the nuclear receptor superfamily. The highly conserved region C contains two zinc fingers and corresponds to the core of the DNA-binding domain (DBD), which is responsible for specific recognition of the cognate response elements. Region E is functionally complex, since in addition to the ligand-binding domain (LBD), it contains a ligand-dependent activation function (AF-2) and a dimerization interface. An autonomous transcriptional activation function (AF-1) is present in the non-conserved N-terminal A/B regions of the steroid receptors. Interestingly, both AF-1 and AF-2 of steroid receptors exhibit differential transcriptional activation properties which appear to be both cell type and promoter context specific (Gronemeyer, H. Annu. Rev. Genet. 25:89-123 (1991)).

[0032] As described above, the all-trans (T-RA) and 9-cis (9C-RA) retinoic acid signals are transduced by two families of nuclear receptors: RAR α, β and γ (and their isoforms) are activated by both T-RA and 9C-RA, whereas RXR α, β and γ are exclusively activated by 9C-RA (Allenby, G. et al., Proc. Natl. Acad. Sci. USA 90:30-34 (1993)). The three RAR types differ in their B regions, and their main isoforms (α1 and α2, β1-4, and γ1 and γ2) have different N-terminal A regions (Leid, M. et al., Trends Biochem. Sci. 17:427-433 (1992)). Similarly, the RXR types differ in their A/B regions (Mangelsdorf, D. J. et al., Genes Dev. 6:329-344 (1992)).

[0033] The E-region of RARs and RXRs has also been shown to contain a dimerization interface (Yu, V. C. et al., Curr. Opin. Biotechnol. 3:597-602 (1992)). Most interestingly, it was demonstrated that RAR/RXR heterodimers bind much more efficiently in vitro than homodimers of either receptor to a number of RA response elements (RAREs, also known as retinoic acid receptor responders) (Yu, V. C. et al., Cell 67:1251-1266 (1991); Berrodin, T. J. et al., Mol. Endocrinol 6:1468-1478 (1992); Bugge, T. H. et al., EMBO J. 11:1409-1418 (1992); Hall, R. K. et al., Mol. Cell. Biol. 12: 5527-5535 (1992); Hallenbeck P. L. et al., Proc. Natl. Acad. Sci. USA 89:5572-5576 (1992); Husmann, M. et al., Biochem. Biophys. Res. Commun. 187:1558-1564 (1992); Kliewer, S. A. et al., Nature 355:446-449 (1992); Leid, M. et al., Cell 68:377-395 (1992); Marks, M. S. et al., EMBO J. 11:1419-1435 (1992); Zhang, X. K. et al., Nature 355:441-446 (1992)). RAR and RXR heterodimers are also preferentially formed in solution in vitro (Yu, V. C. et al., Cell 67:1251-1266 (1991); Leid, M. et al., Cell 68:377-395 (1992); Marks, M. S. et al., EMBO J. 11:1419-1435 (1992)), although the addition of 9C-RA appears to enhance the formation of RXR homodimers in vitro (Lehman, J. M. et al., Science 258:1944-1946 (1992); Zhang, X. K. et al., Nature 358:587-591 (1992b)).

[0034] It has been shown that activation of RA-responsive promoters likely occurs through RAR:RXR heterodimers rather than through homodimers (Yu, V. C. et al., Cell 67:1251-1266 (1991); Leid et al., Cell 68:377-395 (1992b); Durand et al., Cell 71:73-85 (1992); Nagpal et al., Cell 70:1007-1019 (1992); Zhang, X. K., et al., Nature 355:441-446 (1992); Kliewer et al., Nature 355:446-449 (1992); Bugge et al., EMBO J. 11: 1409-1418 (1992); Marks et al., EMBO J. 11:1419-1435 (1992); Yu, V. C. et al., Cur. Op. Biotech. 3:597-602 (1992); Leid et al., TIBS 17:427-433 (1992); Laudet and Stehelin, Curr. Biol. 2:293-295 (1992); Green, S., Nature 361:590-591 (1993)). The RXR portion of these heterodimers has been proposed to be silent in retinoid-induced signaling (Kurokawa, R., et al., Nature 371:528-531 (1994); Forman, B. M., et al., Cell 81:541-550 (1995); Mangelsdorf, D. J., and Evans, R. M., Cell 83:841-850 (1995)), although conflicting results have been reported on this issue (Apfel, C. M., et al., J. Biol. Chem. 270(51):30765-30772 (1995); see Chambon, P., FASEB J. 10:940-954 (1996) for review). Although the results of these studies strongly suggest that RAR/RXR heterodimers are indeed functional units that transduce the RA signal in vivo, it is unclear whether all of the suggested heterodimeric combinations occur in vivo (Chambon, P., Semin. Cell Biol. 5:115-125 (1994)). Thus, the basis for the highly pleiotropic effect of retinoids may reside, at least in part, in the control of different subsets of retinoid-responsive promoters by cell-specifically expressed heterodimeric combinations of RAR:RXR types (and isoforms), whose activity may be in turn regulated by cell-specific levels of all-trans- and 9-cis-RA (Leid et al., TIBS 17:427-433 (1992)).
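To give a sense of the combinatorial space referred to above, the pairings of the main RAR isoforms with the three RXR types named in this Background can be enumerated as in the following sketch. It is a counting illustration only: RXR isoforms are not considered, the function names and the example expression set are hypothetical, and, as noted above, it is unclear whether all such combinations occur in vivo.

```python
from itertools import product
from typing import Iterable, List, Set

# Main RAR isoforms and RXR types as listed in the text above.
RAR_ISOFORMS: List[str] = ["RARα1", "RARα2",
                           "RARβ1", "RARβ2", "RARβ3", "RARβ4",
                           "RARγ1", "RARγ2"]
RXR_TYPES: List[str] = ["RXRα", "RXRβ", "RXRγ"]


def heterodimer_combinations(rars: Iterable[str],
                             rxrs: Iterable[str]) -> List[str]:
    """List every RAR:RXR pairing that is formally possible."""
    return [f"{rar}:{rxr}" for rar, rxr in product(rars, rxrs)]


def combinations_in_cell(expressed: Set[str]) -> List[str]:
    """Restrict the pairings to receptors assumed to be expressed in a cell."""
    rars = [r for r in RAR_ISOFORMS if r in expressed]
    rxrs = [r for r in RXR_TYPES if r in expressed]
    return heterodimer_combinations(rars, rxrs)


all_pairs = heterodimer_combinations(RAR_ISOFORMS, RXR_TYPES)
print(len(all_pairs))

# Hypothetical cell type expressing two RAR isoforms and one RXR type.
print(combinations_in_cell({"RARγ1", "RARγ2", "RXRα"}))
```

Restricting the enumeration to the receptors expressed in a given cell type illustrates how cell-specific expression could narrow the set of heterodimers available to control retinoid-responsive promoters.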

[0035] The RXR receptors may also be involved in RA-independent signaling. For example, the observation of aberrant lipid metabolism in the Sertoli cells of RXRβ−/− mutant animals suggests that functional interactions may also occur between RXRβ and the peroxisomal proliferator-activated receptor signaling pathway (WO 94/26100; Kastner, P., et al., Genes & Devel. 10:80-92 (1996)).

[0036] For a further review of retinoic acid receptors, see: Shimizu et al., Cancer Res 2000 Aug. 15;60(16):4544-9; Ponnamperuma et al., Nutr Cancer 2000;37(1):82-8; Yoshimura et al., J Med Chem 2000 Jul. 27;43(15):2929-37; Kurie et al., Clin Cancer Res 2000 August;6(8):2973-9; Lee et al., J Biol Chem 2000 August 17; and Sainty et al., Blood 2000 Aug. 15;96(4):1287-96.

[0037] Secreted proteins, particularly members of the retinoic acid receptor responder secreted protein subfamily, are a major target for drug action and development. Accordingly, it is valuable to the field of pharmaceutical development to identify and characterize previously unknown members of this subfamily of secreted proteins. The present invention advances the state of the art by providing previously unidentified human secreted proteins that have homology to members of the retinoic acid receptor responder secreted protein subfamily.

SUMMARY OF THE INVENTION

[0038] The present invention is based in part on the identification of amino acid sequences of human secreted peptides and proteins that are related to the retinoic acid receptor responder secreted protein subfamily, as well as allelic variants and other mammalian orthologs thereof. These unique peptide sequences, and nucleic acid sequences that encode these peptides, can be used as models for the development of human therapeutic targets, aid in the identification of therapeutic proteins, and serve as targets for the development of human therapeutic agents that modulate secreted protein activity in cells and tissues that express the secreted protein.

DESCRIPTION OF THE FIGURE SHEETS

[0039] FIG. 1 provides the nucleotide sequence of a cDNA molecule or transcript sequence that encodes the secreted protein of the present invention (SEQ ID NO:1). In addition, structure and functional information is provided, such as ATG start, stop and tissue distribution, where available, that allows one to readily determine specific uses of inventions based on this molecular sequence.

[0040] FIG. 2 provides the predicted amino acid sequence of the secreted protein of the present invention (SEQ ID NO:2). In addition, structure and functional information, such as protein family, function, and modification sites, is provided where available, allowing one to readily determine specific uses of inventions based on this molecular sequence.

[0041] FIG. 3 provides genomic sequences that span the gene encoding the secreted protein of the present invention (SEQ ID NO:3). In addition, structure and functional information, such as intron/exon structure, promoter location, etc., is provided where available, allowing one to readily determine specific uses of inventions based on this molecular sequence. As indicated by the data presented in FIG. 3, the map position was determined to be on chromosome 16.

DETAILED DESCRIPTION OF THE INVENTION

[0042] General Description

[0043] The present invention is based on the sequencing of the human genome. During the sequencing and assembly of the human genome, analysis of the sequence information revealed previously unidentified fragments of the human genome that encode peptides that share structural and/or sequence homology to protein/peptide/domains identified and characterized within the art as being a secreted protein or part of a secreted protein and are related to the retinoic acid receptor responder secreted protein subfamily. Utilizing these sequences, additional genomic sequences were assembled and transcript and/or cDNA sequences were isolated and characterized. Based on this analysis, the present invention provides amino acid sequences of human secreted peptides and proteins that are related to the retinoic acid receptor responder secreted protein subfamily, nucleic acid sequences in the form of transcript sequences, cDNA sequences and/or genomic sequences that encode these secreted peptides and proteins, nucleic acid variation (allelic information), tissue distribution of expression, and information about the closest art known protein/peptide/domain that has structural or sequence homology to the secreted protein of the present invention.

[0044] In addition to being previously unknown, the peptides that are provided in the present invention are selected based on their ability to be used for the development of commercially important products and services. Specifically, the present peptides are selected based on homology and/or structural relatedness to known secreted proteins of the retinoic acid receptor responder secreted protein subfamily and the expression pattern observed. The art has clearly established the commercial importance of members of this family of proteins and proteins that have expression patterns similar to that of the present gene. Some of the more specific features of the peptides of the present invention, and the uses thereof, are described herein, particularly in the Background of the Invention and in the annotation provided in the Figures, and/or are known within the art for each of the known retinoic acid receptor responder family or subfamily of secreted proteins.

[0045] Specific Embodiments

[0046] Peptide Molecules

[0047] The present invention provides nucleic acid sequences that encode protein molecules that have been identified as being members of the secreted protein family of proteins and are related to the retinoic acid receptor responder secreted protein subfamily (protein sequences are provided in FIG. 2, transcript/cDNA sequences are provided in FIG. 1 and genomic sequences are provided in FIG. 3). The peptide sequences provided in FIG. 2, as well as the obvious variants described herein, particularly allelic variants as identified herein and using the information in FIG. 3, will be referred to herein as the secreted peptides of the present invention, secreted peptides, or peptides/proteins of the present invention.

[0048] The present invention provides isolated peptide and protein molecules that consist of, consist essentially of, or comprise the amino acid sequences of the secreted peptides disclosed in FIG. 2 (encoded by the nucleic acid molecule shown in FIG. 1, transcript/cDNA, or FIG. 3, genomic sequence), as well as all obvious variants of these peptides that are within the art to make and use. Some of these variants are described in detail below.

[0049] As used herein, a peptide is said to be “isolated” or “purified” when it is substantially free of cellular material or free of chemical precursors or other chemicals. The peptides of the present invention can be purified to homogeneity or other degrees of purity. The level of purification will be based on the intended use. The critical feature is that the preparation allows for the desired function of the peptide, even if in the presence of considerable amounts of other components (the features of an isolated nucleic acid molecule are discussed below).

[0050] In some uses, “substantially free of cellular material” includes preparations of the peptide having less than about 30% (by dry weight) other proteins (i.e., contaminating protein), less than about 20% other proteins, less than about 10% other proteins, or less than about 5% other proteins. When the peptide is recombinantly produced, it can also be substantially free of culture medium, i.e., culture medium represents less than about 20% of the volume of the protein preparation.

[0051] The language “substantially free of chemical precursors or other chemicals” includes preparations of the peptide in which it is separated from chemical precursors or other chemicals that are involved in its synthesis. In one embodiment, the language “substantially free of chemical precursors or other chemicals” includes preparations of the secreted peptide having less than about 30% (by dry weight) chemical precursors or other chemicals, less than about 20% chemical precursors or other chemicals, less than about 10% chemical precursors or other chemicals, or less than about 5% chemical precursors or other chemicals.

[0052] The isolated secreted peptide can be purified from cells that naturally express it, purified from cells that have been altered to express it (recombinant), or synthesized using known protein synthesis methods. For example, a nucleic acid molecule encoding the secreted peptide is cloned into an expression vector, the expression vector introduced into a host cell and the protein expressed in the host cell. The protein can then be isolated from the cells by an appropriate purification scheme using standard protein purification techniques. Many of these techniques are described in detail below.

[0053] Accordingly, the present invention provides proteins that consist of the amino acid sequences provided in FIG. 2 (SEQ ID NO:2), for example, proteins encoded by the transcript/cDNA nucleic acid sequences shown in FIG. 1 (SEQ ID NO:1) and the genomic sequences provided in FIG. 3 (SEQ ID NO:3). The amino acid sequence of such a protein is provided in FIG. 2. A protein consists of an amino acid sequence when the amino acid sequence is the final amino acid sequence of the protein.

[0054] The present invention further provides proteins that consist essentially of the amino acid sequences provided in FIG. 2 (SEQ ID NO:2), for example, proteins encoded by the transcript/cDNA nucleic acid sequences shown in FIG. 1 (SEQ ID NO:1) and the genomic sequences provided in FIG. 3 (SEQ ID NO:3). A protein consists essentially of an amino acid sequence when such an amino acid sequence is present with only a few additional amino acid residues, for example from about 1 to about 100 or so additional residues, typically from 1 to about 20 additional residues in the final protein.

[0055] The present invention further provides proteins that comprise the amino acid sequences provided in FIG. 2 (SEQ ID NO:2), for example, proteins encoded by the transcript/cDNA nucleic acid sequences shown in FIG. 1 (SEQ ID NO:1) and the genomic sequences provided in FIG. 3 (SEQ ID NO:3). A protein comprises an amino acid sequence when the amino acid sequence is at least part of the final amino acid sequence of the protein. In such a fashion, the protein can be only the peptide or have additional amino acid molecules, such as amino acid residues (contiguous encoded sequence) that are naturally associated with it or heterologous amino acid residues/peptide sequences. Such a protein can have a few additional amino acid residues or can comprise several hundred or more additional amino acids. The preferred classes of proteins that are comprised of the secreted peptides of the present invention are the naturally occurring mature proteins. A brief description of how various types of these proteins can be made/isolated is provided below.
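The three relationships defined in the preceding paragraphs ("consist of", "consist essentially of", and "comprise") can be illustrated with a short sketch. This is a minimal illustration only, not a legal test: the function name `classify_inclusion` and the parameter `essentially_max_extra` are hypothetical, and the hard cutoff of 100 additional residues is an assumption standing in for the approximate range "from about 1 to about 100 or so additional residues" stated above. The function reports the narrowest relationship that applies.

```python
from typing import Optional


def classify_inclusion(protein: str, peptide: str,
                       essentially_max_extra: int = 100) -> Optional[str]:
    """Return the narrowest relationship between a protein and a reference peptide.

    Assumption: 'consists essentially of' is modeled as the peptide plus at most
    `essentially_max_extra` additional residues (hypothetical hard cutoff).
    """
    # The peptide must appear as a contiguous stretch of the protein sequence.
    if not peptide or peptide not in protein:
        return None
    extra = len(protein) - len(peptide)
    if extra == 0:
        # The peptide is the final amino acid sequence of the protein.
        return "consists of"
    if extra <= essentially_max_extra:
        # Only a limited number of additional residues are present.
        return "consists essentially of"
    # The peptide is part of a substantially longer sequence.
    return "comprises"
```

For instance, a hypothetical peptide carrying a two-residue N-terminal extension would be reported as "consists essentially of", while the same peptide embedded in a sequence several hundred residues longer would be reported as "comprises".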

[0056] The secreted peptides of the present invention can be attached to heterologous sequences to form chimeric or fusion proteins. Such chimeric and fusion proteins comprise a secreted peptide operatively linked to a heterologous protein having an amino acid sequence not substantially homologous to the secreted peptide. “Operatively linked” indicates that the secreted peptide and the heterologous protein are fused in-frame. The heterologous protein can be fused to the N-terminus or C-terminus of the secreted peptide.

[0057] In some uses, the fusion protein does not affect the activity of the secreted peptide per se. For example, the fusion protein can include, but is not limited to, enzymatic fusion proteins, for example beta-galactosidase fusions, yeast two-hybrid GAL fusions, poly-His fusions, MYC-tagged, HI-tagged and Ig fusions. Such fusion proteins, particularly poly-His fusions, can facilitate the purification of recombinant secreted peptide. In certain host cells (e.g., mammalian host cells), expression and/or secretion of a protein can be increased by using a heterologous signal sequence.

[0058] A chimeric or fusion protein can be produced by standard recombinant DNA techniques. For example, DNA fragments coding for the different protein sequences are ligated together in-frame in accordance with conventional techniques. In another embodiment, the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers. Alternatively, PCR amplification of gene fragments can be carried out using anchor primers which give rise to complementary overhangs between two consecutive gene fragments which can subsequently be annealed and re-amplified to generate a chimeric gene sequence (see Ausubel et al., Current Protocols in Molecular Biology, 1992). Moreover, many expression vectors are commercially available that already encode a fusion moiety (e.g., a GST protein). A secreted peptide-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the secreted peptide.

[0059] As mentioned above, the present invention also provides and enables obvious variants of the amino acid sequence of the proteins of the present invention, such as naturally occurring mature forms of the peptide, allelic/sequence variants of the peptides, non-naturally occurring recombinantly derived variants of the peptides, and orthologs and paralogs of the peptides. Such variants can readily be generated using art-known techniques in the fields of recombinant nucleic acid technology and protein biochemistry. It is understood, however, that variants exclude any amino acid sequences disclosed prior to the invention.

[0060] Such variants can readily be identified/made using molecular techniques and the sequence information disclosed herein. Further, such variants can readily be distinguished from other peptides based on sequence and/or structural homology to the secreted peptides of the present invention. The degree of homology/identity present will be based primarily on whether the peptide is a functional variant or non-functional variant, the amount of divergence present in the paralog family and the evolutionary distance between the orthologs.

[0061] To determine the percent identity of two amino acid sequences or two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, at least 30%, 40%, 50%, 60%, 70%, 80%, or 90% or more of the length of a reference sequence is aligned for comparison purposes. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”). The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
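The position-by-position comparison described above can be sketched as follows. This is a minimal illustration that assumes the two sequences have already been aligned and padded with a gap character; the function name `percent_identity` is hypothetical, and the choice of denominator (all aligned columns, including gapped ones) is one assumed way of "taking into account" gaps, not the only possible convention.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of two pre-aligned sequences.

    Assumption: gapped columns count toward the denominator, so every
    introduced gap lowers the reported identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        return 0.0
    # A column is identical when both residues match and neither is a gap.
    identical = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * identical / len(columns)
```

Under these assumptions, the hypothetical aligned pair `ACDE-G` and `ACDQFG` shares four identical positions out of six aligned columns.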

[0062] The comparison of sequences and determination of percent identity and similarity between two sequences can be accomplished using a mathematical algorithm (Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part 1, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991). In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch (J. Mol. Biol. (48):444-453 (1970)) algorithm which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a BLOSUM62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (Devereux, J., et al., Nucleic Acids Res. 12(1):387 (1984)) (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. In another embodiment, the percent identity between two amino acid or nucleotide sequences is determined using the algorithm of E. Myers and W. Miller (CABIOS, 4:11-17 (1989)) which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.
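The global alignment approach of Needleman and Wunsch can be sketched in a few lines. The following is a simplified illustration and is not the GAP program: it assumes a flat match/mismatch score in place of a substitution matrix and a single linear gap penalty in place of separate gap and length weights. The function name and the default scores are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -2):
    """Global alignment by dynamic programming (simplified scoring).

    Returns (score, aligned_a, aligned_b). Assumptions: flat match/mismatch
    scores and a linear gap penalty, chosen only for illustration.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,  # align two residues
                              score[i - 1][j] + gap,      # gap in b
                              score[i][j - 1] + gap)      # gap in a
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

A production comparison would substitute a residue scoring matrix and affine gap penalties, as in the embodiments described above; the dynamic programming recurrence and the traceback retain the same overall structure.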

[0063] The nucleic acid and protein sequences of the present invention can further be used as a “query sequence” to perform a search against sequence databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (J. Mol. Biol. 215:403-10 (1990)). BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to the nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to the proteins of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (Nucleic Acids Res. 25(17):3389-3402 (1997)). When utilizing BLAST and gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used.

[0064] Full-length pre-processed forms, as well as mature processed forms, of proteins that comprise one of the peptides of the present invention can readily be identified as having complete sequence identity to one of the secreted peptides of the present invention as well as being encoded by the same genetic locus as the secreted peptide provided herein.

[0065] Allelic variants of a secreted peptide can readily be identified as being a human protein having a high degree (significant) of sequence homology/identity to at least a portion of the secreted peptide as well as being encoded by the same genetic locus as the secreted peptide provided herein. Genetic locus can readily be determined based on the genomic information provided in FIG. 3, such as the genomic sequence mapped to the reference human. As indicated by the data presented in FIG. 3, the map position was determined to be on chromosome 16. As used herein, two proteins (or a region of the proteins) have significant homology when the amino acid sequences are typically at least about 70-80%, 80-90%, and more typically at least about 90-95% or more homologous. A significantly homologous amino acid sequence, according to the present invention, will be encoded by a nucleic acid sequence that will hybridize to a secreted peptide encoding nucleic acid molecule under stringent conditions as more fully described below.

[0066] FIG. 3 provides SNP information that has been found in a gene encoding the secreted proteins of the present invention. The following variations were identified: T948−, G1149A, G4199C, T4352G, A6493G, T14047GA, G14136T, G14238A, T14260G, C25736T, G26321A, A31359G, T35098G, C40532G, A41706G, G51095C, G53101C, G54556C, −61872T, G62172C, A62860C, C67086A, A67621C, A70582T, C74175−, T74478C, T77092G, T77328C, G77385A, C77947T, G79395C, −81111C, and G81610A.
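The variation labels listed above can be handled programmatically. The sketch below rests on an assumed reading of the notation that the text does not spell out: the leading base is taken as the reference allele, the number as a position in the genomic sequence of FIG. 3, the trailing base or bases as the variant allele, and "−" as an absent base (an insertion when leading, a deletion when trailing). The names `Variant` and `parse_variant` are hypothetical.

```python
import re
from typing import NamedTuple


class Variant(NamedTuple):
    ref: str        # reference allele ("" when absent)
    position: int   # position in the genomic sequence (assumed 1-based)
    alt: str        # variant allele ("" when absent)
    kind: str       # "substitution", "insertion", "deletion", or "complex"


# Accept both the typographic minus sign (U+2212) and the ASCII hyphen.
_GAPS = {"\u2212", "-"}
_PATTERN = re.compile(r"^([ACGT]+|[\u2212-])(\d+)([ACGT]+|[\u2212-])$")


def parse_variant(token: str) -> Variant:
    """Parse one label such as 'G1149A' under the assumed notation."""
    found = _PATTERN.match(token.strip())
    if found is None:
        raise ValueError(f"unrecognized variation label: {token!r}")
    ref, pos, alt = found.groups()
    ref_gap, alt_gap = ref in _GAPS, alt in _GAPS
    if ref_gap and alt_gap:
        raise ValueError(f"label has no allele: {token!r}")
    if ref_gap:
        kind = "insertion"
    elif alt_gap:
        kind = "deletion"
    elif len(ref) == 1 and len(alt) == 1:
        kind = "substitution"
    else:
        kind = "complex"   # e.g. a single base replaced by two bases
    return Variant("" if ref_gap else ref, int(pos), "" if alt_gap else alt, kind)
```

Under this reading, `G1149A` parses as a substitution at position 1149, `T948−` as a deletion, `−61872T` as an insertion, and `T14047GA` as a complex change.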

[0067] Paralogs of a secreted peptide can readily be identified as having some degree of significant sequence homology/identity to at least a portion of the secreted peptide, as being encoded by a gene from humans, and as having similar activity or function. Two proteins will typically be considered paralogs when the amino acid sequences are typically at least about 60% or greater, and more typically at least about 70% or greater homology through a given region or domain. Such paralogs will be encoded by a nucleic acid sequence that will hybridize to a secreted peptide encoding nucleic acid molecule under moderate to stringent conditions as more fully described below.

[0068] Orthologs of a secreted peptide can readily be identified as having some degree of significant sequence homology/identity to at least a portion of the secreted peptide as well as being encoded by a gene from another organism. Preferred orthologs will be isolated from mammals, preferably primates, for the development of human therapeutic targets and agents. Such orthologs will be encoded by a nucleic acid sequence that will hybridize to a secreted peptide encoding nucleic acid molecule under moderate to stringent conditions, as more fully described below, depending on the degree of relatedness of the two organisms yielding the proteins.

[0069] Non-naturally occurring variants of the secreted peptides of the present invention can readily be generated using recombinant techniques. Such variants include, but are not limited to, deletions, additions and substitutions in the amino acid sequence of the secreted peptide. For example, one class of substitutions is conservative amino acid substitutions. Such substitutions are those that substitute a given amino acid in a secreted peptide by another amino acid of like characteristics. Typically seen as conservative substitutions are the replacements, one for another, among the aliphatic amino acids Ala, Val, Leu, and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic residues Asp and Glu; substitution between the amide residues Asn and Gln; exchange of the basic residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr. Guidance concerning which amino acid changes are likely to be phenotypically silent is found in Bowie et al., Science 247:1306-1310 (1990).
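The conservative substitution classes enumerated above lend themselves to a simple lookup. The sketch below encodes only the six groups named in the preceding paragraph, using three-letter residue codes; the names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical, and treating an unchanged residue as "not a substitution" is an assumption made for illustration.

```python
# The six groups named in the text, as three-letter residue codes.
CONSERVATIVE_GROUPS = (
    frozenset({"Ala", "Val", "Leu", "Ile"}),  # aliphatic
    frozenset({"Ser", "Thr"}),                # hydroxyl
    frozenset({"Asp", "Glu"}),                # acidic
    frozenset({"Asn", "Gln"}),                # amide
    frozenset({"Lys", "Arg"}),                # basic
    frozenset({"Phe", "Tyr"}),                # aromatic
)


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues fall within one of the listed groups."""
    if original == replacement:
        # Assumption: an unchanged residue is not counted as a substitution.
        return False
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)
```

With this table, a Leu to Ile change is reported as conservative, whereas an Asp to Lys change is not.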

[0070] Variant secreted peptides can be fully functional or can lack function in one or more activities, e.g. ability to bind substrate, ability to phosphorylate substrate, ability to mediate signaling, etc. Fully functional variants typically contain only conservative variation or variation in non-critical residues or in non-critical regions. FIG. 2 provides the result of protein analysis and can be used to identify critical domains/regions. Functional variants can also contain substitution of similar amino acids that result in no change or an insignificant change in function. Alternatively, such substitutions may positively or negatively affect function to some degree.

[0071] Non-functional variants typically contain one or more non-conservative amino acid substitutions, deletions, insertions, inversions, or truncation or a substitution, insertion, inversion, or deletion in a critical residue or critical region.

[0072] Amino acids that are essential for function can be identified by methods known in the art, such as site-directed mutagenesis or alanine-scanning mutagenesis (Cunningham et al., Science 244:1081-1085 (1989)), particularly using the results provided in FIG. 2. The latter procedure introduces single alanine mutations at every residue in the molecule. The resulting mutant molecules are then tested for biological activity such as secreted protein activity or in assays such as an in vitro proliferative activity. Sites that are critical for binding partner/substrate binding can also be determined by structural analysis such as crystallization, nuclear magnetic resonance or photoaffinity labeling (Smith et al., J. Mol. Biol. 224:899-904 (1992); de Vos et al. Science 255:306-312 (1992)).
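The design of an alanine-scanning panel can be sketched as follows. This is an illustration of the enumeration step only, using one-letter residue codes; the function name `alanine_scan` and the mutant labels of the form `K2A` are hypothetical, and skipping positions that already hold alanine is an assumption (such positions would yield no change).

```python
def alanine_scan(sequence: str):
    """List single-alanine mutants of a one-letter amino acid sequence.

    Returns (label, mutant_sequence) pairs; positions are 1-based.
    Assumption: residues that are already alanine are skipped.
    """
    mutants = []
    for index, residue in enumerate(sequence):
        if residue == "A":
            continue
        label = f"{residue}{index + 1}A"
        # Replace exactly one residue with alanine, leaving the rest intact.
        mutant = sequence[:index] + "A" + sequence[index + 1:]
        mutants.append((label, mutant))
    return mutants
```

Each resulting sequence would then be expressed and tested for biological activity as described above; the sketch covers only the in silico listing of mutants.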

[0073] The present invention further provides fragments of the secreted peptides, in addition to proteins and peptides that comprise and consist of such fragments, particularly those comprising the residues identified in FIG. 2. The fragments to which the invention pertains, however, are not to be construed as encompassing fragments that may be disclosed publicly prior to the present invention.

[0074] As used herein, a fragment comprises at least 8, 10, 12, 14, 16, or more contiguous amino acid residues from a secreted peptide. Such fragments can be chosen based on the ability to retain one or more of the biological activities of the secreted peptide or could be chosen for the ability to perform a function, e.g. bind a substrate or act as an immunogen. Particularly important fragments are biologically active fragments, peptides that are, for example, about 8 or more amino acids in length. Such fragments will typically comprise a domain or motif of the secreted peptide, e.g., active site or a substrate-binding domain. Further, possible fragments include, but are not limited to, domain or motif containing fragments, soluble peptide fragments, and fragments containing immunogenic structures. Predicted domains and functional sites are readily identifiable by computer programs well known and readily available to those of skill in the art (e.g., PROSITE analysis). The results of one such analysis are provided in FIG. 2.
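The notion of a fragment as a run of contiguous residues can be made concrete with a sliding window. The sketch below lists every contiguous fragment of a chosen length from a one-letter sequence; the function name `contiguous_fragments` is hypothetical, and the default window of 8 residues simply mirrors the smallest length recited above.

```python
def contiguous_fragments(sequence: str, length: int = 8):
    """Return all contiguous fragments of exactly `length` residues."""
    if length < 1:
        raise ValueError("length must be at least 1")
    # A sequence shorter than the window yields no fragments.
    return [sequence[start:start + length]
            for start in range(len(sequence) - length + 1)]
```

Fragments of other lengths (10, 12, 14, 16, or more) can be obtained by calling the function once per length; selection for activity, domain content, or immunogenicity is outside the scope of this sketch.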

[0075] Polypeptides often contain amino acids other than the 20 amino acids commonly referred to as the 20 naturally occurring amino acids. Further, many amino acids, including the terminal amino acids, may be modified by natural processes, such as processing and other post-translational modifications, or by chemical modification techniques well known in the art. Common modifications that occur naturally in secreted peptides are described in basic texts, detailed monographs, and the research literature, and they are well known to those of skill in the art (some of these features are identified in FIG. 2).

[0076] Known modifications include, but are not limited to, acetylation, acylation, ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond formation, demethylation, formation of covalent crosslinks, formation of cystine, formation of pyroglutamate, formylation, gamma carboxylation, glycosylation, GPI anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation, proteolytic processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins such as arginylation, and ubiquitination.

[0077] Such modifications are well known to those of skill in the art and have been described in great detail in the scientific literature. Several particularly common modifications, glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation, for instance, are described in most basic texts, such as Proteins—Structure and Molecular Properties, 2nd Ed., T. E. Creighton, W. H. Freeman and Company, New York (1993). Many detailed reviews are available on this subject, such as by Wold, F., Posttranslational Covalent Modification of Proteins, B. C. Johnson, Ed., Academic Press, New York 1-12 (1983); Seifter et al. (Meth. Enzymol. 182: 626-646 (1990)) and Rattan et al. (Ann. N.Y. Acad. Sci. 663:48-62 (1992)).

[0078] Accordingly, the secreted peptides of the present invention also encompass derivatives or analogs in which a substituted amino acid residue is not one encoded by the genetic code, in which a substituent group is included, in which the mature secreted peptide is fused with another compound, such as a compound to increase the half-life of the secreted peptide (for example, polyethylene glycol), or in which the additional amino acids are fused to the mature secreted peptide, such as a leader or secretory sequence or a sequence for purification of the mature secreted peptide or a pro-protein sequence.

[0079] Protein/Peptide Uses

[0080] The proteins of the present invention can be used in substantial and specific assays related to the functional information provided in the Figures; to raise antibodies or to elicit another immune response; as a reagent (including the labeled reagent) in assays designed to quantitatively determine levels of the protein (or its binding partner or ligand) in biological fluids; and as markers for tissues in which the corresponding protein is preferentially expressed (either constitutively or at a particular stage of tissue differentiation or development or in a disease state). Where the protein binds or potentially binds to another protein or ligand (such as, for example, in a secreted protein-effector protein interaction or secreted protein-ligand interaction), the protein can be used to identify the binding partner/ligand so as to develop a system to identify inhibitors of the binding interaction. Any or all of these uses are capable of being developed into reagent grade or kit format for commercialization as commercial products.

[0081] Methods for performing the uses listed above are well known to those skilled in the art. References disclosing such methods include “Molecular Cloning: A Laboratory Manual”, 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch and T. Maniatis eds., 1989, and “Methods in Enzymology: Guide to Molecular Cloning Techniques”, Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

[0082] The potential uses of the peptides of the present invention are based primarily on the source of the protein as well as the class/action of the protein. For example, secreted proteins isolated from humans and their human/mammalian orthologs serve as targets for identifying agents for use in mammalian therapeutic applications, e.g. a human drug, particularly in modulating a biological or pathological response in a cell or tissue that expresses the secreted protein. A large percentage of pharmaceutical agents are being developed that modulate the activity of secreted proteins, particularly members of the retinoic acid receptor responder subfamily (see Background of the Invention). The structural and functional information provided in the Background and Figures provides specific and substantial uses for the molecules of the present invention, particularly in combination with the expression information provided in FIG. 1. Such uses can readily be determined using the information provided herein, that which is known in the art, and routine experimentation.

[0083] The proteins of the present invention (including variants and fragments that may have been disclosed prior to the present invention) are useful for biological assays related to secreted proteins that are related to members of the retinoic acid receptor responder subfamily. Such assays involve any of the known secreted protein functions or activities or properties useful for diagnosis and treatment of secreted protein-related conditions that are specific for the subfamily of secreted proteins to which the protein of the present invention belongs, particularly in cells and tissues that express the secreted protein.

[0084] The proteins of the present invention are also useful in drug screening assays, in cell-based or cell-free systems. Cell-based systems can be native, i.e., cells that normally express the secreted protein, as a biopsy or expanded in cell culture. In an alternate embodiment, cell-based assays involve recombinant host cells expressing the secreted protein.

[0085] The polypeptides can be used to identify compounds that modulate secreted protein activity of the protein in its natural state or an altered form that causes a specific disease or pathology associated with the secreted protein. Both the secreted proteins of the present invention and appropriate variants and fragments can be used in high-throughput screens to assay candidate compounds for the ability to bind to the secreted protein. These compounds can be further screened against a functional secreted protein to determine the effect of the compound on the secreted protein activity. Further, these compounds can be tested in animal or invertebrate systems to determine activity/effectiveness. Compounds can be identified that activate (agonist) or inactivate (antagonist) the secreted protein to a desired degree.

[0086] Further, the proteins of the present invention can be used to screen a compound for the ability to stimulate or inhibit interaction between the secreted protein and a molecule that normally interacts with the secreted protein, e.g. a substrate or a component of the signal pathway with which the secreted protein normally interacts (for example, another secreted protein). Such assays typically include the steps of combining the secreted protein with a candidate compound under conditions that allow the secreted protein, or fragment, to interact with the target molecule, and to detect the formation of a complex between the protein and the target or to detect the biochemical consequence of the interaction with the secreted protein and the target.

[0087] Candidate compounds include, for example, 1) peptides such as soluble peptides, including Ig-tailed fusion peptides and members of random peptide libraries (see, e.g., Lam et al., Nature 354:82-84 (1991); Houghten et al., Nature 354:84-86 (1991)) and combinatorial chemistry-derived molecular libraries made of D- and/or L-configuration amino acids; 2) phosphopeptides (e.g., members of random and partially degenerate, directed phosphopeptide libraries, see, e.g., Songyang et al., Cell 72:767-778 (1993)); 3) antibodies (e.g., polyclonal, monoclonal, humanized, anti-idiotypic, chimeric, and single chain antibodies as well as Fab, F(ab′)₂, Fab expression library fragments, and epitope-binding fragments of antibodies); and 4) small organic and inorganic molecules (e.g., molecules obtained from combinatorial and natural product libraries).

[0088] One candidate compound is a soluble fragment of the receptor that competes for substrate binding. Other candidate compounds include mutant secreted proteins or appropriate fragments containing mutations that affect secreted protein function and thus compete for substrate. Accordingly, a fragment that competes for substrate, for example with a higher affinity, or a fragment that binds substrate but does not allow release, is encompassed by the invention.

[0089] Any of the biological or biochemical functions mediated by the secreted protein can be used as an endpoint assay. These include all of the biochemical or biochemical/biological events described herein, in the references cited herein, incorporated by reference for these endpoint assay targets, and other functions known to those of ordinary skill in the art or that can be readily identified using the information provided in the Figures, particularly FIG. 2. Specifically, a biological function of a cell or tissues that expresses the secreted protein can be assayed.

[0090] Binding and/or activating compounds can also be screened by using chimeric secreted proteins in which the amino terminal extracellular domain, or parts thereof, the entire transmembrane domain or subregions, such as any of the seven transmembrane segments or any of the intracellular or extracellular loops and the carboxy terminal intracellular domain, or parts thereof, can be replaced by heterologous domains or subregions. For example, a substrate-binding region can be used that interacts with a different substrate than that which is recognized by the native secreted protein. Accordingly, a different set of signal transduction components is available as an end-point assay for activation. This allows for assays to be performed in other than the specific host cell from which the secreted protein is derived.

[0091] The proteins of the present invention are also useful in competition binding assays in methods designed to discover compounds that interact with the secreted protein (e.g. binding partners and/or ligands). Thus, a compound is exposed to a secreted protein polypeptide under conditions that allow the compound to bind or to otherwise interact with the polypeptide. Soluble secreted protein polypeptide is also added to the mixture. If the test compound interacts with the soluble secreted protein polypeptide, it decreases the amount of complex formed or activity from the secreted protein target. This type of assay is particularly useful in cases in which compounds are sought that interact with specific regions of the secreted protein. Thus, the soluble polypeptide that competes with the target secreted protein region is designed to contain peptide sequences corresponding to the region of interest.

[0092] To perform cell-free drug screening assays, it is sometimes desirable to immobilize either the secreted protein, or a fragment thereof, or its target molecule to facilitate separation of complexes from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay.

[0093] Techniques for immobilizing proteins on matrices can be used in the drug screening assays. In one embodiment, a fusion protein can be provided which adds a domain that allows the protein to be bound to a matrix. For example, glutathione-S-transferase fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtitre plates, which are then combined with the cell lysates (e.g., ³⁵S-labeled) and the candidate compound, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads are washed to remove any unbound label, and the matrix immobilized and radiolabel determined directly, or in the supernatant after the complexes are dissociated. Alternatively, the complexes can be dissociated from the matrix, separated by SDS-PAGE, and the level of secreted protein-binding protein found in the bead fraction quantitated from the gel using standard electrophoretic techniques. For example, either the polypeptide or its target molecule can be immobilized utilizing conjugation of biotin and streptavidin using techniques well known in the art. Alternatively, antibodies reactive with the protein but which do not interfere with binding of the protein to its target molecule can be derivatized to the wells of the plate, and the protein trapped in the wells by antibody conjugation. Preparations of a secreted protein-binding protein and a candidate compound are incubated in the secreted protein-presenting wells and the amount of complex trapped in the well can be quantitated. 
Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the secreted protein target molecule, or which are reactive with secreted protein and compete with the target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the target molecule.

[0094] Agents that modulate one of the secreted proteins of the present invention can be identified using one or more of the above assays, alone or in combination. It is generally preferable to use a cell-based or cell-free system first and then confirm activity in an animal or other model system. Such model systems are well known in the art and can readily be employed in this context.

[0095] Modulators of secreted protein activity identified according to these drug screening assays can be used to treat a subject with a disorder mediated by the secreted protein pathway, by treating cells or tissues that express the secreted protein. These methods of treatment include the steps of administering a modulator of secreted protein activity in a pharmaceutical composition to a subject in need of such treatment, the modulator being identified as described herein.

[0096] In yet another aspect of the invention, the secreted proteins can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with the secreted protein and are involved in secreted protein activity.

[0097] The two-hybrid system is based on the modular nature of most transcription factors, which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two different DNA constructs. In one construct, the gene that codes for a secreted protein is fused to a gene encoding the DNA binding domain of a known transcription factor (e.g., GAL4). In the other construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified protein (“prey” or “sample”) is fused to a gene that codes for the activation domain of the known transcription factor. If the “bait” and the “prey” proteins are able to interact, in vivo, forming a secreted protein-dependent complex, the DNA-binding and activation domains of the transcription factor are brought into close proximity. This proximity allows transcription of a reporter gene (e.g., LacZ) which is operably linked to a transcriptional regulatory site responsive to the transcription factor. Expression of the reporter gene can be detected and cell colonies containing the functional transcription factor can be isolated and used to obtain the cloned gene which encodes the protein which interacts with the secreted protein.

[0098] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein in an appropriate animal model. For example, an agent identified as described herein (e.g., a secreted protein-modulating agent, an antisense secreted protein nucleic acid molecule, a secreted protein-specific antibody, or a secreted protein-binding partner) can be used in an animal or other model to determine the efficacy, toxicity, or side effects of treatment with such an agent. Alternatively, an agent identified as described herein can be used in an animal or other model to determine the mechanism of action of such an agent. Furthermore, this invention pertains to uses of novel agents identified by the above-described screening assays for treatments as described herein.

[0099] The secreted proteins of the present invention are also useful to provide a target for diagnosing a disease or predisposition to disease mediated by the peptide. Accordingly, the invention provides methods for detecting the presence, or levels of, the protein (or encoding mRNA) in a cell, tissue, or organism. The method involves contacting a biological sample with a compound capable of interacting with the secreted protein such that the interaction can be detected. Such an assay can be provided in a single detection format or a multi-detection format such as an antibody chip array.

[0100] One agent for detecting a protein in a sample is an antibody capable of selectively binding to the protein. A biological sample includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject.

[0101] The peptides of the present invention also provide targets for diagnosing active protein activity, disease, or predisposition to disease, in a patient having a variant peptide, particularly activities and conditions that are known for other members of the family of proteins to which the present one belongs. Thus, the peptide can be isolated from a biological sample and assayed for the presence of a genetic mutation that results in an aberrant peptide. This includes amino acid substitution, deletion, insertion, rearrangement (as the result of aberrant splicing events), and inappropriate post-translational modification. Analytic methods include altered electrophoretic mobility, altered tryptic peptide digest, altered secreted protein activity in cell-based or cell-free assay, alteration in substrate or antibody-binding pattern, altered isoelectric point, direct amino acid sequencing, and any other of the known assay techniques useful for detecting mutations in a protein. Such an assay can be provided in a single detection format or a multi-detection format such as an antibody chip array.

[0102] In vitro techniques for detection of peptide include enzyme linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations and immunofluorescence using a detection reagent, such as an antibody or protein binding agent. Alternatively, the peptide can be detected in vivo in a subject by introducing into the subject a labeled anti-peptide antibody or other types of detection agent. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. Particularly useful are methods that detect the allelic variant of a peptide expressed in a subject and methods which detect fragments of a peptide in a sample.

[0103] The peptides are also useful in pharmacogenomic analysis. Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, e.g., Eichelbaum, M. (Clin. Exp. Pharmacol. Physiol. 23(10-11):983-985 (1996)), and Linder, M. W. (Clin. Chem. 43(2):254-266 (1997)). The clinical outcomes of these variations result in severe toxicity of therapeutic drugs in certain individuals or therapeutic failure of drugs in certain individuals as a result of individual variation in metabolism. Thus, the genotype of the individual can determine the way a therapeutic compound acts on the body or the way the body metabolizes the compound. Further, the activity of drug metabolizing enzymes affects both the intensity and duration of drug action. Thus, the pharmacogenomics of the individual permit the selection of effective compounds and effective dosages of such compounds for prophylactic or therapeutic treatment based on the individual's genotype. The discovery of genetic polymorphisms in some drug metabolizing enzymes has explained why some patients do not obtain the expected drug effects, show an exaggerated drug effect, or experience serious toxicity from standard drug dosages. Polymorphisms can be expressed in the phenotype of the extensive metabolizer and the phenotype of the poor metabolizer. Accordingly, genetic polymorphism may lead to allelic protein variants of the secreted protein in which one or more of the secreted protein functions in one population is different from those in another population. The peptides thus provide a target for ascertaining a genetic predisposition that can affect treatment modality. Thus, in a ligand-based treatment, polymorphism may give rise to amino terminal extracellular domains and/or other substrate-binding regions that are more or less active in substrate binding, and secreted protein activation.
Accordingly, substrate dosage would necessarily be modified to maximize the therapeutic effect within a given population containing a polymorphism. As an alternative to genotyping, specific polymorphic peptides could be identified.

[0104] The peptides are also useful for treating a disorder characterized by an absence of, inappropriate, or unwanted expression of the protein. Accordingly, methods for treatment include the use of the secreted protein or fragments.

[0105] Antibodies

[0106] The invention also provides antibodies that selectively bind to one of the peptides of the present invention, a protein comprising such a peptide, as well as variants and fragments thereof. As used herein, an antibody selectively binds a target peptide when it binds the target peptide and does not significantly bind to unrelated proteins. An antibody is still considered to selectively bind a peptide even if it also binds to other proteins that are not substantially homologous with the target peptide so long as such proteins share homology with a fragment or domain of the peptide target of the antibody. In this case, it would be understood that antibody binding to the peptide is still selective despite some degree of cross-reactivity.

[0107] As used herein, an antibody is defined in terms consistent with that recognized within the art: antibodies are multi-subunit proteins produced by a mammalian organism in response to an antigen challenge. The antibodies of the present invention include polyclonal antibodies and monoclonal antibodies, as well as fragments of such antibodies, including, but not limited to, Fab or F(ab′)₂, and Fv fragments.

[0108] Many methods are known for generating and/or identifying antibodies to a given target peptide. Several such methods are described by Harlow, Antibodies, Cold Spring Harbor Press, (1989).

[0109] In general, to generate antibodies, an isolated peptide is used as an immunogen and is administered to a mammalian organism, such as a rat, rabbit or mouse. The full-length protein, an antigenic peptide fragment or a fusion protein can be used. Particularly important fragments are those covering functional domains, such as the domains identified in FIG. 2, and domains of sequence homology or divergence amongst the family, such as those that can readily be identified using protein alignment methods and as presented in the Figures.

[0110] Antibodies are preferably prepared from regions or discrete fragments of the secreted proteins. Antibodies can be prepared from any region of the peptide as described herein. However, preferred regions will include those involved in function/activity and/or secreted protein/binding partner interaction. FIG. 2 can be used to identify particularly important regions while sequence alignment can be used to identify conserved and unique sequence fragments.

[0111] An antigenic fragment will typically comprise at least 8 contiguous amino acid residues. The antigenic peptide can comprise, however, at least 10, 12, 14, 16 or more amino acid residues. Such fragments can be selected based on a physical property, such as fragments that correspond to regions located on the surface of the protein, e.g., hydrophilic regions, or can be selected based on sequence uniqueness (see FIG. 2).
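The selection of hydrophilic fragments of at least 8 contiguous residues can be illustrated with a minimal sketch. It assumes the published Kyte-Doolittle hydropathy scale as a stand-in for surface exposure; the specification does not name a particular scale. The function name `hydrophilic_fragments` and the example sequence are hypothetical, and the sequence is not SEQ ID NO:2.

```python
# Kyte-Doolittle hydropathy values (more negative = more hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophilic_fragments(protein, window=8, top_n=3):
    """Return the top_n most hydrophilic windows as (start, fragment, mean score)."""
    if window < 8:
        raise ValueError("window should be at least 8 residues")
    protein = protein.upper()
    scored = []
    # Slide a fixed-length window along the sequence and average the scale values.
    for start in range(len(protein) - window + 1):
        fragment = protein[start:start + window]
        mean = sum(KYTE_DOOLITTLE[aa] for aa in fragment) / window
        scored.append((start, fragment, mean))
    # Lowest mean hydropathy first, i.e. most hydrophilic first.
    scored.sort(key=lambda item: item[2])
    return scored[:top_n]


# Hypothetical sequence for illustration only; it is not SEQ ID NO:2.
EXAMPLE_PROTEIN = "MLLVVAILAGKDEERRKSDNQEPLLVIAF"
for start, fragment, score in hydrophilic_fragments(EXAMPLE_PROTEIN):
    print(start, fragment, round(score, 2))
```

A hydropathy average is only a rough proxy for surface location; candidate fragments chosen this way would still need to be checked for sequence uniqueness and tested empirically.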

[0112] Detection of an antibody of the present invention can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance. Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material is luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive materials include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

[0113] Antibody Uses

[0114] The antibodies can be used to isolate one of the proteins of the present invention by standard techniques, such as affinity chromatography or immunoprecipitation. The antibodies can facilitate the purification of the natural protein from cells and recombinantly produced protein expressed in host cells. In addition, such antibodies are useful to detect the presence of one of the proteins of the present invention in cells or tissues to determine the pattern of expression of the protein among various tissues in an organism and over the course of normal development. Further, such antibodies can be used to detect protein in situ, in vitro, or in a cell lysate or supernatant in order to evaluate the abundance and pattern of expression. Also, such antibodies can be used to assess abnormal tissue distribution or abnormal expression during development or progression of a biological condition. Antibody detection of circulating fragments of the full length protein can be used to identify turnover.

[0115] Further, the antibodies can be used to assess expression in disease states such as in active stages of the disease or in an individual with a predisposition toward disease related to the protein's function. When a disorder is caused by an inappropriate tissue distribution, developmental expression, level of expression of the protein, or expressed/processed form, the antibody can be prepared against the normal protein. If a disorder is characterized by a specific mutation in the protein, antibodies specific for this mutant protein can be used to assay for the presence of the specific mutant protein.

[0116] The antibodies can also be used to assess normal and aberrant subcellular localization of the protein in cells in the various tissues in an organism. The diagnostic uses can be applied, not only in genetic testing, but also in monitoring a treatment modality. Accordingly, where treatment is ultimately aimed at correcting expression level or the presence of aberrant sequence and aberrant tissue distribution or developmental expression, antibodies directed against the protein or relevant fragments can be used to monitor therapeutic efficacy.

[0117] Additionally, antibodies are useful in pharmacogenomic analysis. Thus, antibodies prepared against polymorphic proteins can be used to identify individuals that require modified treatment modalities. The antibodies are also useful as diagnostic tools as an immunological marker for aberrant protein analyzed by electrophoretic mobility, isoelectric point, tryptic peptide digest, and other physical assays known to those in the art.

[0118] The antibodies are also useful for tissue typing. Thus, where a specific protein has been correlated with expression in a specific tissue, antibodies that are specific for this protein can be used to identify a tissue type.

[0119] The antibodies are also useful for inhibiting protein function, for example, blocking the binding of the secreted peptide to a binding partner such as a substrate. These uses can also be applied in a therapeutic context in which treatment involves inhibiting the protein's function. An antibody can be used, for example, to block binding, thus modulating (agonizing or antagonizing) the peptide's activity. Antibodies can be prepared against specific fragments containing sites required for function or against intact protein that is associated with a cell or cell membrane. See FIG. 2 for structural information relating to the proteins of the present invention.

[0120] The invention also encompasses kits for using antibodies to detect the presence of a protein in a biological sample. The kit can comprise antibodies such as a labeled or labelable antibody and a compound or agent for detecting protein in a biological sample; means for determining the amount of protein in the sample; means for comparing the amount of protein in the sample with a standard; and instructions for use. Such a kit can be supplied to detect a single protein or epitope or can be configured to detect one of a multitude of epitopes, such as in an antibody detection array. Arrays are described in detail below for nucleic acid arrays, and similar methods have been developed for antibody arrays.

[0121] Nucleic Acid Molecules

[0122] The present invention further provides isolated nucleic acid molecules that encode a secreted peptide or protein of the present invention (cDNA, transcript and genomic sequence). Such nucleic acid molecules will consist of, consist essentially of, or comprise a nucleotide sequence that encodes one of the secreted peptides of the present invention, an allelic variant thereof, or an ortholog or paralog thereof.

[0123] As used herein, an “isolated” nucleic acid molecule is one that is separated from other nucleic acid present in the natural source of the nucleic acid. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. However, there can be some flanking nucleotide sequences, for example up to about 5 KB, 4 KB, 3 KB, 2 KB, or 1 KB or less, particularly contiguous peptide encoding sequences and peptide encoding sequences within the same gene but separated by introns in the genomic sequence. The important point is that the nucleic acid is isolated from remote and unimportant flanking sequences such that it can be subjected to the specific manipulations described herein such as recombinant expression, preparation of probes and primers, and other uses specific to the nucleic acid sequences.

[0124] Moreover, an “isolated” nucleic acid molecule, such as a transcript/cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or chemical precursors or other chemicals when chemically synthesized. However, the nucleic acid molecule can be fused to other coding or regulatory sequences and still be considered isolated.

[0125] For example, recombinant DNA molecules contained in a vector are considered isolated. Further examples of isolated DNA molecules include recombinant DNA molecules maintained in heterologous host cells or purified (partially or substantially) DNA molecules in solution. Isolated RNA molecules include in vivo or in vitro RNA transcripts of the isolated DNA molecules of the present invention. Isolated nucleic acid molecules according to the present invention further include such molecules produced synthetically.

[0126] Accordingly, the present invention provides nucleic acid molecules that consist of the nucleotide sequence shown in FIG. 1 or 3 (SEQ ID NO:1, transcript sequence and SEQ ID NO:3, genomic sequence), or any nucleic acid molecule that encodes the protein provided in FIG. 2, SEQ ID NO:2. A nucleic acid molecule consists of a nucleotide sequence when the nucleotide sequence is the complete nucleotide sequence of the nucleic acid molecule.

[0127] The present invention further provides nucleic acid molecules that consist essentially of the nucleotide sequence shown in FIG. 1 or 3 (SEQ ID NO:1, transcript sequence and SEQ ID NO:3, genomic sequence), or any nucleic acid molecule that encodes the protein provided in FIG. 2, SEQ ID NO:2. A nucleic acid molecule consists essentially of a nucleotide sequence when such a nucleotide sequence is present with only a few additional nucleic acid residues in the final nucleic acid molecule.

[0128] The present invention further provides nucleic acid molecules that comprise the nucleotide sequences shown in FIG. 1 or 3 (SEQ ID NO:1, transcript sequence and SEQ ID NO:3, genomic sequence), or any nucleic acid molecule that encodes the protein provided in FIG. 2, SEQ ID NO:2. A nucleic acid molecule comprises a nucleotide sequence when the nucleotide sequence is at least part of the final nucleotide sequence of the nucleic acid molecule. In such a fashion, the nucleic acid molecule can be only the nucleotide sequence or have additional nucleic acid residues, such as nucleic acid residues that are naturally associated with it or heterologous nucleotide sequences. Such a nucleic acid molecule can have a few additional nucleotides or can comprise several hundred or more additional nucleotides. A brief description of how various types of these nucleic acid molecules can be readily made/isolated is provided below.

[0129] In FIGS. 1 and 3, both coding and non-coding sequences are provided. Because of the source of the present invention, human genomic sequence (FIG. 3) and cDNA/transcript sequences (FIG. 1), the nucleic acid molecules in the Figures will contain genomic intronic sequences, 5′ and 3′ non-coding sequences, gene regulatory regions and non-coding intergenic sequences. In general, such sequence features are either noted in FIGS. 1 and 3 or can readily be identified using computational tools known in the art. As discussed below, some of the non-coding regions, particularly gene regulatory elements such as promoters, are useful for a variety of purposes, e.g., control of heterologous gene expression, target for identifying gene activity modulating compounds, and are particularly claimed as fragments of the genomic sequence provided herein.

[0130] The isolated nucleic acid molecules can encode the mature protein plus additional amino or carboxyl-terminal amino acids, or amino acids interior to the mature peptide (when the mature form has more than one peptide chain, for instance). Such sequences may play a role in processing of a protein from precursor to a mature form, facilitate protein trafficking, prolong or shorten protein half-life or facilitate manipulation of a protein for assay or production, among other things. As generally is the case in situ, the additional amino acids may be processed away from the mature protein by cellular enzymes.

[0131] As mentioned above, the isolated nucleic acid molecules include, but are not limited to, the sequence encoding the secreted peptide alone, the sequence encoding the mature peptide and additional coding sequences, such as a leader or secretory sequence (e.g., a pre-pro or pro-protein sequence), the sequence encoding the mature peptide, with or without the additional coding sequences, plus additional non-coding sequences, for example introns and non-coding 5′ and 3′ sequences such as transcribed but non-translated sequences that play a role in transcription, mRNA processing (including splicing and polyadenylation signals), ribosome binding and stability of mRNA. In addition, the nucleic acid molecule may be fused to a marker sequence encoding, for example, a peptide that facilitates purification.

[0132] Isolated nucleic acid molecules can be in the form of RNA, such as mRNA, or in the form of DNA, including cDNA and genomic DNA obtained by cloning or produced by chemical synthetic techniques or by a combination thereof. The nucleic acid, especially DNA, can be double-stranded or single-stranded. Single-stranded nucleic acid can be the coding strand (sense strand) or the non-coding strand (anti-sense strand).
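The relationship between the coding (sense) strand, the non-coding (anti-sense) strand, and the corresponding mRNA can be sketched as follows. The function names and the example sequence are hypothetical and chosen for illustration only; the sequence is not taken from FIG. 1 or FIG. 3.

```python
# Translation table mapping each DNA base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense_strand(sense):
    """Return the non-coding strand (reverse complement), written 5' to 3'."""
    return sense.translate(COMPLEMENT)[::-1]


def sense_to_mrna(sense):
    """Return the mRNA that matches the coding strand, with U in place of T."""
    return sense.upper().replace("T", "U")


# Hypothetical coding-strand fragment for illustration only.
EXAMPLE_SENSE = "ATGGCCATTGTA"
print(antisense_strand(EXAMPLE_SENSE))
print(sense_to_mrna(EXAMPLE_SENSE))
```

Applying `antisense_strand` twice returns the original sequence, which is a quick consistency check when preparing probes against either strand.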

[0133] The invention further provides nucleic acid molecules that encode fragments of the peptides of the present invention as well as nucleic acid molecules that encode obvious variants of the secreted proteins of the present invention that are described above. Such nucleic acid molecules may be naturally occurring, such as allelic variants (same locus), paralogs (different locus), and orthologs (different organism), or may be constructed by recombinant DNA methods or by chemical synthesis. Such non-naturally occurring variants may be made by mutagenesis techniques, including those applied to nucleic acid molecules, cells, or organisms. Accordingly, as discussed above, the variants can contain nucleotide substitutions, deletions, inversions and insertions. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions.

[0134] The present invention further provides non-coding fragments of the nucleic acid molecules provided in FIGS. 1 and 3. Preferred non-coding fragments include, but are not limited to, promoter sequences, enhancer sequences, gene modulating sequences and gene termination sequences. Such fragments are useful in controlling heterologous gene expression and in developing screens to identify gene-modulating agents. A promoter can readily be identified as being 5′ to the ATG start site in the genomic sequence provided in FIG. 3.
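As a minimal sketch of locating candidate promoter sequence 5′ to the ATG start site, the following assumes that the position of the start codon is already known from the sequence annotation; the first ATG found in a genomic sequence is not necessarily the start site. The function name `upstream_region`, the default window length, and the example sequence are hypothetical.

```python
def upstream_region(genomic, atg_index, length=500):
    """Return up to `length` bases located immediately 5' of the annotated ATG.

    atg_index is the 0-based position of the A of the start codon; it is
    assumed to come from the sequence annotation, not from a blind search.
    """
    genomic = genomic.upper()
    if atg_index < 0 or genomic[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    # Clamp at the start of the sequence when less upstream sequence is available.
    start = max(0, atg_index - length)
    return genomic[start:atg_index]


# Hypothetical genomic fragment for illustration only; it is not SEQ ID NO:3.
EXAMPLE_GENOMIC = "ccgtataaaggcATGgcc"
print(upstream_region(EXAMPLE_GENOMIC, 12, length=8))
```

The returned window is only a candidate region; identifying functional promoter elements within it would require further computational or experimental analysis.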

[0135] A fragment comprises a contiguous nucleotide sequence greater than 12 or more nucleotides. Further, a fragment could be at least 30, 40, 50, 100, 250 or 500 nucleotides in length. The length of the fragment will be based on its intended use. For example, the fragment can encode epitope bearing regions of the peptide, or can be useful as DNA probes and primers. Such fragments can be isolated using the known nucleotide sequence to synthesize an oligonucleotide probe. A labeled probe can then be used to screen a cDNA library, genomic DNA library, or mRNA to isolate nucleic acid corresponding to the coding region. Further, primers can be used in PCR reactions to clone specific regions of a gene.
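Enumerating contiguous fragments of a chosen length from a known nucleotide sequence, for example as candidate probe sequences, can be sketched as follows. Treating 12 nucleotides as the lower bound is an assumption made for this sketch; the function name `contiguous_fragments` and the example sequence are hypothetical.

```python
def contiguous_fragments(sequence, length=12, step=1):
    """List (start, fragment) pairs for every contiguous window of `length` bases."""
    if length < 12:
        # Lower bound assumed for this sketch.
        raise ValueError("fragment length should be at least 12 nucleotides")
    sequence = sequence.upper()
    return [
        (start, sequence[start:start + length])
        for start in range(0, len(sequence) - length + 1, step)
    ]


# Hypothetical sequence for illustration only; it is not SEQ ID NO:1.
EXAMPLE_SEQUENCE = "ATGGCCATTGTAATGGGCC"
for start, fragment in contiguous_fragments(EXAMPLE_SEQUENCE, length=12, step=3):
    print(start, fragment)
```

Longer windows (30, 40, 50 or more nucleotides) are obtained by changing `length`; which length is appropriate depends on the intended use of the fragment.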

[0136] A probe/primer typically comprises a substantially purified oligonucleotide or oligonucleotide pair. The oligonucleotide typically comprises a region of nucleotide sequence that hybridizes under stringent conditions to at least about 12, 20, 25, 40, 50 or more consecutive nucleotides.

[0137] Orthologs, homologs, and allelic variants can be identified using methods well known in the art. As described in the Peptide Section, these variants comprise a nucleotide sequence encoding a peptide that is typically 60-70%, 70-80%, 80-90%, and more typically at least about 90-95% or more homologous to the nucleotide sequence shown in the Figure sheets or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under moderate to stringent conditions, to the nucleotide sequence shown in the Figure sheets or a fragment of the sequence. Allelic variants can readily be determined by genetic locus of the encoding gene.

[0138] As used herein, the term “hybridizes under stringent conditions” is intended to describe conditions for hybridization and washing under which nucleotide sequences encoding a peptide at least 60-70% homologous to each other typically remain hybridized to each other. The conditions can be such that sequences at least about 60%, at least about 70%, or at least about 80% or more homologous to each other typically remain hybridized to each other. Such stringent conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6. One example of stringent hybridization conditions is hybridization in 6× sodium chloride/sodium citrate (SSC) at about 45° C, followed by one or more washes in 0.2×SSC, 0.1% SDS at 50-65° C. Examples of moderate to low stringency hybridization conditions are well known in the art.
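The percentage thresholds recited above can be illustrated with a minimal sketch that computes percent identity between two sequences that have already been aligned to equal length. Using identity over a pre-computed alignment as a stand-in for "homologous" is an assumption of this sketch, as are the function names and the example sequences; whether two sequences actually remain hybridized under given conditions is determined experimentally.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns with identical bases (gap columns never match)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a, aligned_b, threshold=60.0):
    """True when percent identity is at or above the chosen threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical aligned sequences for illustration only.
SEQ_A = "ATGGCCATTG"
SEQ_B = "ATGGCGATTA"
print(percent_identity(SEQ_A, SEQ_B))
print(meets_threshold(SEQ_A, SEQ_B, threshold=80.0))
```

The threshold parameter can be set to about 60, 70, or 80 to mirror the levels recited in the text; producing the alignment itself is outside the scope of this sketch.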

[0139] Nucleic Acid Molecule Uses

[0140] The nucleic acid molecules of the present invention are useful for probes, primers, chemical intermediates, and in biological assays. The nucleic acid molecules are useful as a hybridization probe for messenger RNA, transcript/cDNA and genomic DNA to isolate full-length cDNA and genomic clones encoding the peptide described in FIG. 2 and to isolate cDNA and genomic clones that correspond to variants (alleles, orthologs, etc.) producing the same or related peptides shown in FIG. 2.

[0141] The probe can correspond to any sequence along the entire length of the nucleic acid molecules provided in the Figures. Accordingly, it could be derived from 5′ noncoding regions, the coding region, and 3′ noncoding regions. However, as discussed, fragments are not to be construed as encompassing fragments disclosed prior to the present invention.

[0142] The nucleic acid molecules are also useful as primers for PCR to amplify any given region of a nucleic acid molecule and are useful to synthesize antisense molecules of desired length and sequence.

[0143] The nucleic acid molecules are also useful for constructing recombinant vectors. Such vectors include expression vectors that express a portion of, or all of, the peptide sequences. Vectors also include insertion vectors, used to integrate into another nucleic acid molecule sequence, such as into the cellular genome, to alter in situ expression of a gene and/or gene product. For example, an endogenous coding sequence can be replaced via homologous recombination with all or part of the coding region containing one or more specifically introduced mutations.

[0144] The nucleic acid molecules are also useful for expressing antigenic portions of the proteins.

[0145] The nucleic acid molecules are also useful as probes for determining the chromosomal positions of the nucleic acid molecules by means of in situ hybridization methods.

[0146] The nucleic acid molecules are also useful in making vectors containing the gene regulatory regions of the nucleic acid molecules of the present invention.

[0147] The nucleic acid molecules are also useful for designing ribozymes corresponding to all, or a part, of the mRNA produced from the nucleic acid molecules described herein.

[0148] The nucleic acid molecules are also useful for making vectors that express part, or all, of the peptides.

[0149] The nucleic acid molecules are also useful for constructing host cells expressing a part, or all, of the nucleic acid molecules and peptides.

[0150] The nucleic acid molecules are also useful for constructing transgenic animals expressing all, or a part, of the nucleic acid molecules and peptides.

[0151] The nucleic acid molecules are also useful as hybridization probes for determining the presence, level, form and distribution of nucleic acid expression. Accordingly, the probes can be used to detect the presence of, or to determine levels of, a specific nucleic acid molecule in cells, tissues, and in organisms. The nucleic acid whose level is determined can be DNA or RNA. Accordingly, probes corresponding to the peptides described herein can be used to assess expression and/or gene copy number in a given cell, tissue, or organism. These uses are relevant for diagnosis of disorders involving an increase or decrease in secreted protein expression relative to normal results.

[0152] In vitro techniques for detection of mRNA include Northern hybridizations and in situ hybridizations. In vitro techniques for detecting DNA include Southern hybridizations and in situ hybridization.

[0153] Probes can be used as a part of a diagnostic test kit for identifying cells or tissues that express a secreted protein, such as by measuring the level of a secreted protein-encoding nucleic acid (e.g., mRNA or genomic DNA) in a sample of cells from a subject, or by determining whether a secreted protein gene has been mutated.

[0154] Nucleic acid expression assays are useful for drug screening to identify compounds that modulate secreted protein nucleic acid expression.

[0155] The invention thus provides a method for identifying a compound that can be used to treat a disorder associated with nucleic acid expression of the secreted protein gene, particularly biological and pathological processes that are mediated by the secreted protein in cells and tissues that express it. The method typically includes assaying the ability of the compound to modulate the expression of the secreted protein nucleic acid and thus identifying a compound that can be used to treat a disorder characterized by undesired secreted protein nucleic acid expression. The assays can be performed in cell-based and cell-free systems. Cell-based assays include cells naturally expressing the secreted protein nucleic acid or recombinant cells genetically engineered to express specific nucleic acid sequences.

[0156] Thus, modulators of secreted protein gene expression can be identified in a method wherein a cell is contacted with a candidate compound and the expression of mRNA is determined. The level of expression of secreted protein mRNA in the presence of the candidate compound is compared to the level of expression of secreted protein mRNA in the absence of the candidate compound. The candidate compound can then be identified as a modulator of nucleic acid expression based on this comparison and be used, for example, to treat a disorder characterized by aberrant nucleic acid expression. When expression of mRNA is statistically significantly greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of nucleic acid expression. When nucleic acid expression is statistically significantly less in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of nucleic acid expression.
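The comparison described in this paragraph can be illustrated with a minimal sketch. It assumes replicate mRNA measurements are available for treated and untreated cells and uses an exact two-sided permutation test on the difference of group means; the specification does not prescribe a particular statistical test, and the function names, the 0.05 threshold, and the measurement values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation test on the difference of group means."""
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    n_rest = len(pooled) - n_treated
    observed = abs(mean(treated) - mean(untreated))
    total = sum(pooled)
    extreme = 0
    count = 0
    # Enumerate every way of relabeling the pooled measurements.
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_treated - (total - group_sum) / n_rest)
        # Small tolerance guards against floating-point round-off.
        if diff >= observed - 1e-12:
            extreme += 1
        count += 1
    return extreme / count


def classify_compound(treated, untreated, alpha=0.05):
    """Label a candidate compound from mRNA levels with and without it."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"


# Hypothetical replicate mRNA levels (arbitrary units).
with_compound = [9.1, 8.7, 9.4, 9.0, 8.9]
without_compound = [5.0, 5.3, 4.8, 5.1, 5.2]
print(classify_compound(with_compound, without_compound))
```

With very few replicates an exact permutation test cannot reach small p-values (three against three gives a minimum of 0.1), so the number of replicates bounds the significance level that can be claimed.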

[0157] The invention further provides methods of treatment, with the nucleic acid as a target, using a compound identified through drug screening as a gene modulator to modulate secreted protein nucleic acid expression in cells and tissues that express the secreted protein. Modulation includes both up-regulation (i.e., activation or agonization) and down-regulation (i.e., suppression or antagonization) of nucleic acid expression.

[0158] Alternatively, a modulator for secreted protein nucleic acid expression can be a small molecule or drug identified using the screening assays described herein as long as the drug or small molecule inhibits the secreted protein nucleic acid expression in the cells and tissues that express the protein.

[0159] The nucleic acid molecules are also useful for monitoring the effectiveness of modulating compounds on the expression or activity of the secreted protein gene in clinical trials or in a treatment regimen. Thus, the gene expression pattern can serve as a barometer for the continuing effectiveness of treatment with the compound, particularly with compounds to which a patient can develop resistance. The gene expression pattern can also serve as a marker indicative of a physiological response of the affected cells to the compound. Accordingly, such monitoring would allow either increased administration of the compound or the administration of alternative compounds to which the patient has not become resistant. Similarly, if the level of nucleic acid expression falls below a desirable level, administration of the compound could be commensurately decreased.

[0160] The nucleic acid molecules are also useful in diagnostic assays for qualitative changes in secreted protein nucleic acid expression, and particularly in qualitative changes that lead to pathology. The nucleic acid molecules can be used to detect mutations in secreted protein genes and gene expression products such as mRNA. The nucleic acid molecules can be used as hybridization probes to detect naturally occurring genetic mutations in the secreted protein gene and thereby to determine whether a subject with the mutation is at risk for a disorder caused by the mutation. Mutations include deletion, addition, or substitution of one or more nucleotides in the gene; chromosomal rearrangement, such as inversion or transposition; modification of genomic DNA, such as aberrant methylation patterns; and changes in gene copy number, such as amplification. Detection of a mutated form of the secreted protein gene associated with a dysfunction provides a diagnostic tool for an active disease or susceptibility to disease when the disease results from overexpression, underexpression, or altered expression of a secreted protein.

[0161] Individuals carrying mutations in the secreted protein gene can be detected at the nucleic acid level by a variety of techniques. Genomic DNA can be analyzed directly or can be amplified by using PCR prior to analysis. RNA or cDNA can be used in the same way. In some uses, detection of the mutation involves the use of a probe/primer in a polymerase chain reaction (PCR) (see, e.g., U.S. Pat. Nos. 4,683,195 and 4,683,202), such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR) (see, e.g., Landegren et al., Science 241:1077-1080 (1988); and Nakazawa et al., PNAS 91:360-364 (1994)), the latter of which can be particularly useful for detecting point mutations in the gene (see Abravaya et al., Nucleic Acids Res. 23:675-682 (1995)). This method can include the steps of collecting a sample of cells from a patient, isolating nucleic acid (e.g., genomic, mRNA or both) from the cells of the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a gene under conditions such that hybridization and amplification of the gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. Deletions and insertions can be detected by a change in size of the amplified product compared to the normal genotype. Point mutations can be identified by hybridizing amplified DNA to normal RNA or antisense DNA sequences.

[0162] Alternatively, mutations in a secreted protein gene can be directly identified, for example, by alterations in restriction enzyme digestion patterns determined by gel electrophoresis.
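As a rough illustration of how an altered digestion pattern reveals a mutation, the sketch below performs an in-silico digest of a linear sequence and compares the resulting fragment lengths, which is what a gel would resolve. The two sequences are hypothetical; the only enzyme fact relied on is that EcoRI recognizes GAATTC and cuts after the first G. Only the given strand is searched, which suffices for a palindromic site such as this one.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at each site.

    cut_offset is the number of bases from the start of the recognition
    site to the cut position on the given strand.
    """
    sequence = sequence.upper()
    site = site.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


def pattern_differs(reference, sample, site, cut_offset):
    """True when the two sequences give different fragment-length patterns."""
    return sorted(digest(reference, site, cut_offset)) != sorted(
        digest(sample, site, cut_offset)
    )


# Hypothetical sequences: the sample carries an A-to-G change that
# destroys the second EcoRI site.
wild_type = "ATGCGAATTCTTAGGCCGAATTCAAT"
mutant = "ATGCGAATTCTTAGGCCGAGTTCAAT"

print(digest(wild_type, "GAATTC", 1))  # three fragments
print(digest(mutant, "GAATTC", 1))     # two fragments
print(pattern_differs(wild_type, mutant, "GAATTC", 1))
```

A mutation that neither creates nor destroys a recognition site, and does not change fragment length, would go undetected by this comparison.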

[0163] Further, sequence-specific ribozymes (U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site. Perfectly matched sequences can be distinguished from mismatched sequences by nuclease cleavage digestion assays or by differences in melting temperature.

[0164] Sequence changes at specific locations can also be assessed by nuclease protection assays such as RNase and S1 protection or the chemical cleavage method. Furthermore, sequence differences between a mutant secreted protein gene and a wild-type gene can be determined by direct DNA sequencing. A variety of automated sequencing procedures can be utilized when performing the diagnostic assays (Naeve, C. W., (1995) Biotechniques 19:448), including sequencing by mass spectrometry (see, e.g., PCT International Publication No. WO 94/16101; Cohen et al., Adv. Chromatogr. 36:127-162 (1996); and Griffin et al., Appl. Biochem. Biotechnol. 38:147-159 (1993)).

[0165] Other methods for detecting mutations in the gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA duplexes (Myers et al., Science 230:1242 (1985); Cotton et al., PNAS 85:4397 (1988); Saleeba et al., Meth. Enzymol. 217:286-295 (1992)), methods in which the electrophoretic mobility of mutant and wild-type nucleic acid is compared (Orita et al., PNAS 86:2766 (1989); Cotton et al., Mutat. Res. 285:125-144 (1993); and Hayashi et al., Genet. Anal. Tech. Appl. 9:73-79 (1992)), and methods in which the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (Myers et al., Nature 313:495 (1985)). Examples of other techniques for detecting point mutations include selective oligonucleotide hybridization, selective amplification, and selective primer extension.

[0166] The nucleic acid molecules are also useful for testing an individual for a genotype that while not necessarily causing the disease, nevertheless affects the treatment modality. Thus, the nucleic acid molecules can be used to study the relationship between an individual's genotype and the individual's response to a compound used for treatment (pharmacogenomic relationship). Accordingly, the nucleic acid molecules described herein can be used to assess the mutation content of the secreted protein gene in an individual in order to select an appropriate compound or dosage regimen for treatment.

[0167] Thus, nucleic acid molecules displaying genetic variations that affect treatment provide a diagnostic target that can be used to tailor treatment in an individual. Accordingly, the production of recombinant cells and animals containing these polymorphisms allows effective clinical design of treatment compounds and dosage regimens.

[0168] The nucleic acid molecules are thus useful as antisense constructs to control secreted protein gene expression in cells, tissues, and organisms. A DNA antisense nucleic acid molecule is designed to be complementary to a region of the gene involved in transcription, preventing transcription and hence production of secreted protein. An antisense RNA or DNA nucleic acid molecule would hybridize to the mRNA and thus block translation of mRNA into secreted protein.
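Because an antisense molecule must base-pair with its target in antiparallel orientation, its sequence is the reverse complement of the targeted mRNA region. A minimal sketch follows, with a hypothetical function name and a hypothetical target fragment:

```python
def antisense(target_mrna, as_dna=True):
    """Return the antisense sequence (5' to 3') for an mRNA region (5' to 3').

    as_dna=True yields a DNA oligonucleotide; False yields antisense RNA.
    """
    pairs = {
        "A": "T" if as_dna else "U",
        "U": "A",
        "T": "A",  # tolerate cDNA-style input
        "G": "C",
        "C": "G",
    }
    # Complement each base while reading the target from 3' to 5'.
    return "".join(pairs[base] for base in reversed(target_mrna.upper()))


# Hypothetical target region beginning at a start codon.
print(antisense("AUGGCC"))                # DNA antisense oligonucleotide
print(antisense("AUGGCC", as_dna=False))  # antisense RNA
```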

[0169] Alternatively, a class of antisense molecules can be used to inactivate mRNA in order to decrease expression of secreted protein nucleic acid. Accordingly, these molecules can treat a disorder characterized by abnormal or undesired secreted protein nucleic acid expression. This technique involves cleavage by means of ribozymes containing nucleotide sequences complementary to one or more regions in the mRNA that attenuate the ability of the mRNA to be translated. Possible regions include coding regions and particularly coding regions corresponding to the catalytic and other functional activities of the secreted protein, such as substrate binding.

[0170] The nucleic acid molecules also provide vectors for gene therapy in patients containing cells that are aberrant in secreted protein gene expression. Thus, recombinant cells, which include the patient's cells that have been engineered ex vivo and returned to the patient, are introduced into an individual where the cells produce the desired secreted protein to treat the individual.

[0171] The invention also encompasses kits for detecting the presence of a secreted protein nucleic acid in a biological sample. For example, the kit can comprise reagents such as a labeled or labelable nucleic acid or agent capable of detecting secreted protein nucleic acid in a biological sample; means for determining the amount of secreted protein nucleic acid in the sample; and means for comparing the amount of secreted protein nucleic acid in the sample with a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect secreted protein mRNA or DNA.

[0172] Nucleic Acid Arrays

[0173] The present invention further provides nucleic acid detection kits, such as arrays or microarrays of nucleic acid molecules that are based on the sequence information provided in FIGS. 1 and 3 (SEQ ID NOS:1 and 3).

[0174] As used herein “Arrays” or “Microarrays” refers to an array of distinct polynucleotides or oligonucleotides synthesized on a substrate, such as paper, nylon or other type of membrane, filter, chip, glass slide, or any other suitable solid support. In one embodiment, the microarray is prepared and used according to the methods described in U.S. Pat. No. 5,837,832, Chee et al., PCT application WO95/11995 (Chee et al.), Lockhart, D. J. et al. (1996; Nat. Biotech. 14: 1675-1680) and Schena, M. et al. (1996; Proc. Natl. Acad. Sci. 93: 10614-10619), all of which are incorporated herein in their entirety by reference. In other embodiments, such arrays are produced by the methods described by Brown et al., U.S. Pat. No. 5,807,522.

[0175] The microarray or detection kit is preferably composed of a large number of unique, single-stranded nucleic acid sequences, usually either synthetic antisense oligonucleotides or fragments of cDNAs, fixed to a solid support. The oligonucleotides are preferably about 6-60 nucleotides in length, more preferably 15-30 nucleotides in length, and most preferably about 20-25 nucleotides in length. For a certain type of microarray or detection kit, it may be preferable to use oligonucleotides that are only 7-20 nucleotides in length. The microarray or detection kit may contain oligonucleotides that cover the known 5′ or 3′ sequence; sequential oligonucleotides that cover the full-length sequence; or unique oligonucleotides selected from particular areas along the length of the sequence. Polynucleotides used in the microarray or detection kit may be oligonucleotides that are specific to a gene or genes of interest.

[0176] In order to produce oligonucleotides to a known sequence for a microarray or detection kit, the gene(s) of interest (or an ORF identified from the contigs of the present invention) is typically examined using a computer algorithm which starts at the 5′ or at the 3′ end of the nucleotide sequence. Typical algorithms will then identify oligomers of defined length that are unique to the gene, have a GC content within a range suitable for hybridization, and lack predicted secondary structure that may interfere with hybridization. In certain situations it may be appropriate to use pairs of oligonucleotides on a microarray or detection kit. The “pairs” will be identical, except for one nucleotide that preferably is located in the center of the sequence. The second oligonucleotide in the pair (mismatched by one) serves as a control. The number of oligonucleotide pairs may range from two to one million. The oligomers are synthesized at designated areas on a substrate using a light-directed chemical process. The substrate may be paper, nylon or other type of membrane, filter, chip, glass slide or any other suitable solid support.
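The selection criteria in this paragraph (uniqueness, GC content, absence of secondary structure, and single-base mismatch controls) can be sketched as follows. This is an illustration under stated assumptions and not the algorithm used in practice: the hairpin check is a crude heuristic that merely looks for a short stem and its reverse complement separated by a loop, the GC window of 40-60% and the default length of 20 are arbitrary example values, and all function names and sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(oligo):
    return (oligo.count("G") + oligo.count("C")) / len(oligo)


def has_hairpin(oligo, stem=4, min_loop=3):
    """Crude heuristic: a stem whose reverse complement lies downstream."""
    for i in range(len(oligo) - (2 * stem + min_loop) + 1):
        seed = oligo[i:i + stem]
        if reverse_complement(seed) in oligo[i + stem + min_loop:]:
            return True
    return False


def candidate_oligos(gene, background, length=20, gc_range=(0.4, 0.6)):
    """Scan from the 5' end and keep oligomers that pass every filter."""
    gene = gene.upper()
    others = [seq.upper() for seq in background]
    picks = []
    for start in range(len(gene) - length + 1):
        oligo = gene[start:start + length]
        # Unique within the gene: no earlier or later occurrence.
        if gene.find(oligo) != start or gene.find(oligo, start + 1) != -1:
            continue
        # Unique to the gene: absent from every background sequence.
        if any(oligo in seq for seq in others):
            continue
        if not gc_range[0] <= gc_fraction(oligo) <= gc_range[1]:
            continue
        if has_hairpin(oligo):
            continue
        picks.append((start, oligo))
    return picks


def mismatch_control(oligo):
    """Return the paired control: identical except for the central base."""
    swap = {"A": "T", "T": "A", "G": "C", "C": "G"}
    mid = len(oligo) // 2
    return oligo[:mid] + swap[oligo[mid]] + oligo[mid + 1:]


# Hypothetical toy input.
for start, oligo in candidate_oligos("AACCAACCAACC", [], length=12):
    print(start, oligo, mismatch_control(oligo))
```

Substituting the complementary base at the center is one arbitrary way to build the control; any base other than the original would satisfy the description of a pair that differs by one nucleotide.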

[0177] In another aspect, an oligonucleotide may be synthesized on the surface of the substrate by using a chemical coupling procedure and an ink jet application apparatus, as described in PCT application WO95/25116 (Baldeschweiler et al.) which is incorporated herein in its entirety by reference. In another aspect, a “gridded” array analogous to a dot (or slot) blot may be used to arrange and link cDNA fragments or oligonucleotides to the surface of a substrate using a vacuum system, thermal, UV, mechanical or chemical bonding procedures. An array, such as those described above, may be produced by hand or by using available devices (slot blot or dot blot apparatus), materials (any suitable solid support), and machines (including robotic instruments), and may contain 8, 24, 96, 384, 1536, 6144 or more oligonucleotides, or any other number between two and one million which lends itself to the efficient use of commercially available instrumentation.

[0178] In order to conduct sample analysis using a microarray or detection kit, the RNA or DNA from a biological sample is made into hybridization probes. The mRNA is isolated, and cDNA is produced and used as a template to make antisense RNA (aRNA). The aRNA is amplified in the presence of fluorescent nucleotides, and labeled probes are incubated with the microarray or detection kit so that the probe sequences hybridize to complementary oligonucleotides of the microarray or detection kit. Incubation conditions are adjusted so that hybridization occurs with precise complementary matches or with various degrees of less complementarity. After removal of nonhybridized probes, a scanner is used to determine the levels and patterns of fluorescence. The scanned images are examined to determine degree of complementarity and the relative abundance of each oligonucleotide sequence on the microarray or detection kit. The biological samples may be obtained from any bodily fluids (such as blood, urine, saliva, phlegm, gastric juices, etc.), cultured cells, biopsies, or other tissue preparations. A detection system may be used to measure the absence, presence, and amount of hybridization for all of the distinct sequences simultaneously. This data may be used for large-scale correlation studies on the sequences, expression patterns, mutations, variants, or polymorphisms among samples.

[0179] Using such arrays, the present invention provides methods to identify the expression of the secreted proteins/peptides of the present invention. In detail, such methods comprise incubating a test sample with one or more nucleic acid molecules and assaying for binding of the nucleic acid molecule with components within the test sample. Such assays will typically involve arrays comprising many genes, at least one of which is a gene of the present invention and/or alleles of the secreted protein gene of the present invention.

[0180] Conditions for incubating a nucleic acid molecule with a test sample vary. Incubation conditions depend on the format employed in the assay, the detection methods employed, and the type and nature of the nucleic acid molecule used in the assay. One skilled in the art will recognize that any one of the commonly available hybridization, amplification or array assay formats can readily be adapted to employ the novel fragments of the Human genome disclosed herein. Examples of such assays can be found in Chard, T, An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986); Bullock, G. R. et al., Techniques in Immunocytochemistry, Academic Press, Orlando, Fla. Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice and Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1985).

[0181] The test samples of the present invention include cells and protein or membrane extracts of cells. The test sample used in the above-described method will vary based on the assay format, nature of the detection method and the tissues, cells or extracts used as the sample to be assayed. Methods for preparing nucleic acid extracts or extracts of cells are well known in the art and can readily be adapted in order to obtain a sample that is compatible with the system utilized.

[0182] In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry out the assays of the present invention.

[0183] Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers which comprises: (a) a first container comprising one of the nucleic acid molecules that can bind to a fragment of the Human genome disclosed herein; and (b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting presence of a bound nucleic acid.

[0184] In detail, a compartmentalized kit includes any kit in which reagents are contained in separate containers. Such containers include small glass containers, plastic containers, strips of plastic, glass or paper, or arraying material such as silica. Such containers allow one to efficiently transfer reagents from one compartment to another compartment such that the samples and reagents are not cross-contaminated, and the agents or solutions of each container can be added in a quantitative fashion from one compartment to another. Such containers will include a container which will accept the test sample, a container which contains the nucleic acid probe, containers which contain wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which contain the reagents used to detect the bound probe. One skilled in the art will readily recognize that the previously unidentified secreted protein gene of the present invention can be routinely identified using the sequence information disclosed herein and can be readily incorporated into one of the established kit formats which are well known in the art, particularly expression arrays.

[0185] Vectors/Host Cells

[0186] The invention also provides vectors containing the nucleic acid molecules described herein. The term “vector” refers to a vehicle, preferably a nucleic acid molecule, which can transport the nucleic acid molecules. When the vector is a nucleic acid molecule, the nucleic acid molecules are covalently linked to the vector nucleic acid. With this aspect of the invention, the vector includes a plasmid, single or double stranded phage, a single or double stranded RNA or DNA viral vector, or artificial chromosome, such as a BAC, PAC, YAC, or MAC.

[0187] A vector can be maintained in the host cell as an extrachromosomal element where it replicates and produces additional copies of the nucleic acid molecules. Alternatively, the vector may integrate into the host cell genome and produce additional copies of the nucleic acid molecules when the host cell replicates.

[0188] The invention provides vectors for the maintenance (cloning vectors) or vectors for expression (expression vectors) of the nucleic acid molecules. The vectors can function in prokaryotic or eukaryotic cells or in both (shuttle vectors).

[0189] Expression vectors contain cis-acting regulatory regions that are operably linked in the vector to the nucleic acid molecules such that transcription of the nucleic acid molecules is allowed in a host cell. The nucleic acid molecules can be introduced into the host cell with a separate nucleic acid molecule capable of affecting transcription. Thus, the second nucleic acid molecule may provide a trans-acting factor interacting with the cis-regulatory control region to allow transcription of the nucleic acid molecules from the vector. Alternatively, a trans-acting factor may be supplied by the host cell. Finally, a trans-acting factor can be produced from the vector itself. It is understood, however, that in some embodiments, transcription and/or translation of the nucleic acid molecules can occur in a cell-free system.

[0190] The regulatory sequence to which the nucleic acid molecules described herein can be operably linked include promoters for directing mRNA transcription. These include, but are not limited to, the left promoter from bacteriophage λ, the lac, TRP, and TAC promoters from E. coli, the early and late promoters from SV40, the CMV immediate early promoter, the adenovirus early and late promoters, and retrovirus long-terminal repeats.

[0191] In addition to control regions that promote transcription, expression vectors may also include regions that modulate transcription, such as repressor binding sites and enhancers. Examples include the SV40 enhancer, the cytomegalovirus immediate early enhancer, polyoma enhancer, adenovirus enhancers, and retrovirus LTR enhancers.

[0192] In addition to containing sites for transcription initiation and control, expression vectors can also contain sequences necessary for transcription termination and, in the transcribed region, a ribosome binding site for translation. Other regulatory control elements for expression include initiation and termination codons as well as polyadenylation signals. The person of ordinary skill in the art would be aware of the numerous regulatory sequences that are useful in expression vectors. Such regulatory sequences are described, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., (1989).

[0193] A variety of expression vectors can be used to express a nucleic acid molecule. Such vectors include chromosomal, episomal, and virus-derived vectors, for example vectors derived from bacterial plasmids, from bacteriophage, from yeast episomes, from yeast chromosomal elements, including yeast artificial chromosomes, from viruses such as baculoviruses, papovaviruses such as SV40, Vaccinia viruses, adenoviruses, poxviruses, pseudorabies viruses, and retroviruses. Vectors may also be derived from combinations of these sources such as those derived from plasmid and bacteriophage genetic elements, e.g. cosmids and phagemids. Appropriate cloning and expression vectors for prokaryotic and eukaryotic hosts are described in Sambrook et al., Molecular Cloning: A Laboratory Manual. 2nd. ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., (1989).

[0194] The regulatory sequence may provide constitutive expression in one or more host cells (i.e. tissue specific) or may provide for inducible expression in one or more cell types such as by temperature, nutrient additive, or exogenous factor such as a hormone or other ligand. A variety of vectors providing for constitutive and inducible expression in prokaryotic and eukaryotic hosts are well known to those of ordinary skill in the art.

[0195] The nucleic acid molecules can be inserted into the vector nucleic acid by well-known methodology. Generally, the DNA sequence that will ultimately be expressed is joined to an expression vector by cleaving the DNA sequence and the expression vector with one or more restriction enzymes and then ligating the fragments together. Procedures for restriction enzyme digestion and ligation are well known to those of ordinary skill in the art.

[0196] The vector containing the appropriate nucleic acid molecule can be introduced into an appropriate host cell for propagation or expression using well-known techniques. Bacterial cells include, but are not limited to, E. coli, Streptomyces, and Salmonella typhimurium. Eukaryotic cells include, but are not limited to, yeast, insect cells such as Drosophila, animal cells such as COS and CHO cells, and plant cells.

[0197] As described herein, it may be desirable to express the peptide as a fusion protein. Accordingly, the invention provides fusion vectors that allow for the production of the peptides. Fusion vectors can increase the expression of a recombinant protein, increase the solubility of the recombinant protein, and aid in the purification of the protein by acting, for example, as a ligand for affinity purification. A proteolytic cleavage site may be introduced at the junction of the fusion moiety so that the desired peptide can ultimately be separated from the fusion moiety. Proteolytic enzymes include, but are not limited to, factor Xa, thrombin, and enterokinase. Typical fusion expression vectors include pGEX (Smith et al., Gene 67:31-40 (1988)), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein. Examples of suitable inducible non-fusion E. coli expression vectors include pTrc (Amann et al., Gene 69:301-315 (1988)) and pET 11d (Studier et al., Gene Expression Technology: Methods in Enzymology 185:60-89 (1990)).

[0198] Recombinant protein expression can be maximized in host bacteria by providing a genetic background wherein the host cell has an impaired capacity to proteolytically cleave the recombinant protein. (Gottesman, S., Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990) 119-128). Alternatively, the sequence of the nucleic acid molecule of interest can be altered to provide preferential codon usage for a specific host cell, for example E. coli. (Wada et al., Nucleic Acids Res. 20:2111-2118 (1992)).
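Altering a sequence for preferential codon usage amounts to replacing codons with synonymous codons favored by the host, leaving the encoded peptide unchanged. The sketch below builds the standard genetic code and recodes an open reading frame from a caller-supplied preference map. The three preferences shown are illustrative placeholders and not a validated E. coli usage table, and the function names and input sequence are hypothetical.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a DNA coding sequence; '*' marks a stop codon."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds, preferred):
    """Swap each codon for the preferred synonym of its amino acid, if any."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        residue = CODON_TABLE[codon]
        choice = preferred.get(residue, codon)
        if CODON_TABLE[choice] != residue:
            raise ValueError(f"{choice} does not encode {residue}")
        out.append(choice)
    return "".join(out)


# Illustrative preference map (placeholder values, one codon per residue).
example_preferences = {"L": "CTG", "R": "CGT", "K": "AAA"}

original = "ATGTTAAGAAAGTAA"  # hypothetical ORF: M L R K stop
optimized = recode(original, example_preferences)
print(optimized, translate(optimized) == translate(original))
```

Checking that the translation is unchanged after recoding is a cheap safeguard against an error in the preference map.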

[0199] The nucleic acid molecules can also be expressed by expression vectors that are operative in yeast. Examples of vectors for expression in yeast (e.g., S. cerevisiae) include pYepSec1 (Baldari, et al., EMBO J. 6:229-234 (1987)), pMFa (Kurjan et al., Cell 30:933-943 (1982)), pJRY88 (Schultz et al., Gene 54:113-123 (1987)), and pYES2 (Invitrogen Corporation, San Diego, Calif.).

[0200] The nucleic acid molecules can also be expressed in insect cells using, for example, baculovirus expression vectors. Baculovirus vectors available for expression of proteins in cultured insect cells (e.g., Sf 9 cells) include the pAc series (Smith et al., Mol. Cell Biol. 3:2156-2165 (1983)) and the pVL series (Luckow et al., Virology 170:31-39 (1989)).

[0201] In certain embodiments of the invention, the nucleic acid molecules described herein are expressed in mammalian cells using mammalian expression vectors. Examples of mammalian expression vectors include pCDM8 (Seed, B. Nature 329:840(1987)) and pMT2PC (Kaufman et al., EMBO J. 6:187-195 (1987)).

[0202] The expression vectors listed herein are provided by way of example only of the well-known vectors available to those of ordinary skill in the art that would be useful to express the nucleic acid molecules. The person of ordinary skill in the art would be aware of other vectors suitable for maintenance, propagation, or expression of the nucleic acid molecules described herein. These are found, for example, in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989.

[0203] The invention also encompasses vectors in which the nucleic acid sequences described herein are cloned into the vector in reverse orientation, but operably linked to a regulatory sequence that permits transcription of antisense RNA. Thus, an antisense transcript can be produced to all, or to a portion, of the nucleic acid molecule sequences described herein, including both coding and non-coding regions. Expression of this antisense RNA is subject to each of the parameters described above in relation to expression of the sense RNA (regulatory sequences, constitutive or inducible expression, tissue-specific expression).

[0204] The invention also relates to recombinant host cells containing the vectors described herein. Host cells therefore include prokaryotic cells, lower eukaryotic cells such as yeast, other eukaryotic cells such as insect cells, and higher eukaryotic cells such as mammalian cells.

[0205] The recombinant host cells are prepared by introducing the vector constructs described herein into the cells by techniques readily available to the person of ordinary skill in the art. These include, but are not limited to, calcium phosphate transfection, DEAE-dextran-mediated transfection, cationic lipid-mediated transfection, electroporation, transduction, infection, lipofection, and other techniques such as those found in Sambrook et al. (Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989).

[0206] Host cells can contain more than one vector. Thus, different nucleotide sequences can be introduced on different vectors in the same cell. Similarly, the nucleic acid molecules can be introduced either alone or with other nucleic acid molecules that are not related to the nucleic acid molecules, such as those providing trans-acting factors for expression vectors. When more than one vector is introduced into a cell, the vectors can be introduced independently, co-introduced, or joined to the nucleic acid molecule vector.

[0207] In the case of bacteriophage and viral vectors, these can be introduced into cells as packaged or encapsulated virus by standard procedures for infection and transduction. Viral vectors can be replication-competent or replication-defective. In the case in which viral replication is defective, replication will occur in host cells providing functions that complement the defects.

[0208] Vectors generally include selectable markers that enable the selection of the subpopulation of cells that contain the recombinant vector constructs. The marker can be contained in the same vector that contains the nucleic acid molecules described herein or may be on a separate vector. Markers include tetracycline or ampicillin-resistance genes for prokaryotic host cells and dihydrofolate reductase or neomycin resistance for eukaryotic host cells. However, any marker that provides selection for a phenotypic trait will be effective.

[0209] While the mature proteins can be produced in bacteria, yeast, mammalian cells, and other cells under the control of the appropriate regulatory sequences, cell-free transcription and translation systems can also be used to produce these proteins using RNA derived from the DNA constructs described herein.

[0210] Where secretion of the peptide is desired, which is difficult to achieve with multi-transmembrane-domain-containing proteins such as kinases, appropriate secretion signals are incorporated into the vector. The signal sequence can be endogenous to the peptides or heterologous to these peptides.

[0211] Where the peptide is not secreted into the medium, which is typically the case with kinases, the protein can be isolated from the host cell by standard disruption procedures, including freeze-thaw, sonication, mechanical disruption, use of lysing agents, and the like. The peptide can then be recovered and purified by well-known purification methods including ammonium sulfate precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic-interaction chromatography, affinity chromatography, hydroxylapatite chromatography, lectin chromatography, or high performance liquid chromatography.

[0212] It is also understood that, depending upon the host cell used in recombinant production of the peptides described herein, the peptides can have various glycosylation patterns or may be non-glycosylated, as when produced in bacteria. In addition, the peptides may include an initial modified methionine in some cases as a result of a host-mediated process.

[0213] Uses of Vectors and Host Cells

[0214] The recombinant host cells expressing the peptides described herein have a variety of uses. First, the cells are useful for producing a secreted protein or peptide that can be further purified to produce desired amounts of secreted protein or fragments. Thus, host cells containing expression vectors are useful for peptide production.

[0215] Host cells are also useful for conducting cell-based assays involving the secreted protein or secreted protein fragments, such as those described above as well as other formats known in the art. Thus, a recombinant host cell expressing a native secreted protein is useful for assaying compounds that stimulate or inhibit secreted protein function.

[0216] Host cells are also useful for identifying secreted protein mutants in which these functions are affected. If the mutants naturally occur and give rise to a pathology, host cells containing the mutations are useful to assay compounds that have a desired effect on the mutant secreted protein (for example, stimulating or inhibiting function) which may not be indicated by their effect on the native secreted protein.

[0217] Genetically engineered host cells can be further used to produce non-human transgenic animals. A transgenic animal is preferably a mammal, for example a rodent, such as a rat or mouse, in which one or more of the cells of the animal include a transgene. A transgene is exogenous DNA which is integrated into the genome of a cell from which a transgenic animal develops and which remains in the genome of the mature animal in one or more cell types or tissues of the transgenic animal. These animals are useful for studying the function of a secreted protein and identifying and evaluating modulators of secreted protein activity. Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, and amphibians.

[0218] A transgenic animal can be produced by introducing nucleic acid into the male pronuclei of a fertilized oocyte, e.g., by microinjection or retroviral infection, and allowing the oocyte to develop in a pseudopregnant female foster animal. Any of the secreted protein nucleotide sequences can be introduced as a transgene into the genome of a non-human animal, such as a mouse.

[0219] Any of the regulatory or other sequences useful in expression vectors can form part of the transgenic sequence. This includes intronic sequences and polyadenylation signals, if not already included. A tissue-specific regulatory sequence(s) can be operably linked to the transgene to direct expression of the secreted protein to particular cells.

[0220] Methods for generating transgenic animals via embryo manipulation and microinjection, particularly animals such as mice, have become conventional in the art and are described, for example, in U.S. Pat. Nos. 4,736,866 and 4,870,009, both by Leder et al., U.S. Pat. No. 4,873,191 by Wagner et al. and in Hogan, B., Manipulating the Mouse Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986). Similar methods are used for production of other transgenic animals. A transgenic founder animal can be identified based upon the presence of the transgene in its genome and/or expression of transgenic mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene can further be bred to other transgenic animals carrying other transgenes. A transgenic animal also includes animals in which the entire animal or tissues in the animal have been produced using the homologously recombinant host cells described herein.

[0221] In another embodiment, transgenic non-human animals can be produced which contain selected systems that allow for regulated expression of the transgene. One example of such a system is the cre/loxP recombinase system of bacteriophage P1. For a description of the cre/loxP recombinase system, see, e.g., Lakso et al. PNAS 89:6232-6236 (1992). Another example of a recombinase system is the FLP recombinase system of S. cerevisiae (O'Gorman et al. Science 251:1351-1355 (1991)). If a cre/loxP recombinase system is used to regulate expression of the transgene, animals containing transgenes encoding both the Cre recombinase and a selected protein are required. Such animals can be provided through the construction of “double” transgenic animals, e.g., by mating two transgenic animals, one containing a transgene encoding a selected protein and the other containing a transgene encoding a recombinase.
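As an illustration of the mating scheme, the expected fraction of "double" transgenic offspring can be estimated by enumerating parental gametes. The sketch below assumes, purely for illustration, that each parent is hemizygous for a single-copy transgene and that the two transgenes segregate independently in a Mendelian manner; the allele labels and the function name `offspring_distribution` are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumption: each parent is hemizygous, so half of its gametes carry the
# transgene and half carry no transgene ("-").
CRE_PARENT = ("Cre", "-")             # recombinase transgene
TARGET_PARENT = ("loxP-target", "-")  # transgene encoding the selected protein


def offspring_distribution(parent_a, parent_b):
    """Map each (gamete_a, gamete_b) combination to its expected fraction."""
    outcomes = Counter(product(parent_a, parent_b))
    total = sum(outcomes.values())
    return {genotype: Fraction(n, total) for genotype, n in outcomes.items()}


dist = offspring_distribution(CRE_PARENT, TARGET_PARENT)
double_transgenic = dist[("Cre", "loxP-target")]

if __name__ == "__main__":
    for genotype, fraction in dist.items():
        print(genotype, fraction)
```

Under these assumptions one quarter of the offspring would be expected to carry both transgenes; a homozygous parent could be modeled by listing the transgene allele twice in that parent's tuple.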

[0222] Clones of the non-human transgenic animals described herein can also be produced according to the methods described in Wilmut, I. et al. Nature 385:810-813 (1997) and PCT International Publication Nos. WO 97/07668 and WO 97/07669. In brief, a cell, e.g., a somatic cell, from the transgenic animal can be isolated and induced to exit the growth cycle and enter G0 phase. The quiescent cell can then be fused, e.g., through the use of electrical pulses, to an enucleated oocyte from an animal of the same species from which the quiescent cell is isolated. The reconstructed oocyte is then cultured such that it develops to morula or blastocyst and is then transferred to a pseudopregnant female foster animal. The offspring born of this female foster animal will be a clone of the animal from which the cell, e.g., the somatic cell, is isolated.

[0223] Transgenic animals containing recombinant cells that express the peptides described herein are useful to conduct the assays described herein in an in vivo context. The various physiological factors that are present in vivo and that could affect substrate binding, secreted protein activation, and signal transduction may not be evident from in vitro cell-free or cell-based assays. Accordingly, it is useful to provide non-human transgenic animals to assay in vivo secreted protein function, including substrate interaction, the effect of specific mutant secreted proteins on secreted protein function and substrate interaction, and the effect of chimeric secreted proteins. It is also possible to assess the effect of null mutations, that is, mutations that substantially or completely eliminate one or more secreted protein functions.

[0224] All publications and patents mentioned in the above specification are herein incorporated by reference. Various modifications and variations of the described method and system of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the above-described modes for carrying out the invention which are obvious to those skilled in the field of molecular biology or related fields are intended to be within the scope of the following claims.

1 3 1 1245 DNA Human 1 atgggaccct ggtcagtggt ggtgctggtg tgcggcatga agcagctggg gcaggccctc 60 caggcctcag tttccctgtc tatcatcaca gagaatcagg gcaaaaggtg tcccttctgt 120 ggagcccaga acctgatgac ctgtcagaat cccacccttc cttctgtctc ccatcgaagt 180 cctccaggaa atgcagctgt ttcagtgaca gggggtgatt gccatcttcc aactgaggag 240 gaatttgggg ttttggtcca gtccatgaag tgtgacacag tcagaataaa aggcgtcctt 300 caaggcccta ccacagcccc tcctctcatg accagtgaag gcaatgtaac tgcagaagac 360 actgaggagg caattcgggc ttttgtgtac gctgtggctg ctgcttctgc tgccgaagct 420 tggcattgga gacacctcgt cctcctctca ggacagatcc atgaacccat cggcagcggc 480 gggaatataa tcaacactaa taaaggagga aggagctgtc agaatcctgc ccttccgtct 540 ccagatcaaa gtccttcagg aaatgcaact acttcagtga caagagataa ttatcatctt 600 ctgacagagg aggaatttgg ggtttggtcc cagtccatga agtggcacag tcagaataaa 660 agtggcggga gcgtccccgt gagaggcccg acccaggagc catgttctga atctcaaatt 720 ttgaaagaat cttttgtccc acccacaaca cccaaagaaa ataataaaca ggagagggag 780 gatgaaaatt ggcgtctacc accccctcca gtagcagaaa cacctgtacc atctccttca 840 gtaacagaaa tagagacccc actgcaaaga attccgcgga ctgctaccat agctggagag 900 cccttaggac attgcacttt cactatttct ccggcattcg tacattctgt gctcaacaaa 960 cggaagcggc agctggagct gctgctccgg gaggtggagt ggcctggcag agggcacatg 1020 gctgccacct gctgcaagct ccaggtagaa gggcaggaca gaaccatgag cctagcggca 1080 gcgccggttc gcgaagctcc ccctccgcca acgggcgcct cctcagagcc gtccgtgccc 1140 gccctgccgg gagctgaccc gcagcgcagt gcagagttgc tcctgttggc ggtgaccagg 1200 gagggactgg agcggcggat catctccagg aagcgggctg agtag 1245 2 414 PRT Human 2 Met Gly Pro Trp Ser Val Val Val Leu Val Cys Gly Met Lys Gln Leu 1 5 10 15 Gly Gln Ala Leu Gln Ala Ser Val Ser Leu Ser Ile Ile Thr Glu Asn 20 25 30 Gln Gly Lys Arg Cys Pro Phe Cys Gly Ala Gln Asn Leu Met Thr Cys 35 40 45 Gln Asn Pro Thr Leu Pro Ser Val Ser His Arg Ser Pro Pro Gly Asn 50 55 60 Ala Ala Val Ser Val Thr Gly Gly Asp Cys His Leu Pro Thr Glu Glu 65 70 75 80 Glu Phe Gly Val Leu Val Gln Ser Met Lys Cys Asp Thr Val Arg Ile 85 90 95 Lys Gly Val Leu Gln Gly Pro Thr Thr Ala Pro Pro Leu Met 
Thr Ser 100 105 110 Glu Gly Asn Val Thr Ala Glu Asp Thr Glu Glu Ala Ile Arg Ala Phe 115 120 125 Val Tyr Ala Val Ala Ala Ala Ser Ala Ala Glu Ala Trp His Trp Arg 130 135 140 His Leu Val Leu Leu Ser Gly Gln Ile His Glu Pro Ile Gly Ser Gly 145 150 155 160 Gly Asn Ile Ile Asn Thr Asn Lys Gly Gly Arg Ser Cys Gln Asn Pro 165 170 175 Ala Leu Pro Ser Pro Asp Gln Ser Pro Ser Gly Asn Ala Thr Thr Ser 180 185 190 Val Thr Arg Asp Asn Tyr His Leu Leu Thr Glu Glu Glu Phe Gly Val 195 200 205 Trp Ser Gln Ser Met Lys Trp His Ser Gln Asn Lys Ser Gly Gly Ser 210 215 220 Val Pro Val Arg Gly Pro Thr Gln Glu Pro Cys Ser Glu Ser Gln Ile 225 230 235 240 Leu Lys Glu Ser Phe Val Pro Pro Thr Thr Pro Lys Glu Asn Asn Lys 245 250 255 Gln Glu Arg Glu Asp Glu Asn Trp Arg Leu Pro Pro Pro Pro Val Ala 260 265 270 Glu Thr Pro Val Pro Ser Pro Ser Val Thr Glu Ile Glu Thr Pro Leu 275 280 285 Gln Arg Ile Pro Arg Thr Ala Thr Ile Ala Gly Glu Pro Leu Gly His 290 295 300 Cys Thr Phe Thr Ile Ser Pro Ala Phe Val His Ser Val Leu Asn Lys 305 310 315 320 Arg Lys Arg Gln Leu Glu Leu Leu Leu Arg Glu Val Glu Trp Pro Gly 325 330 335 Arg Gly His Met Ala Ala Thr Cys Cys Lys Leu Gln Val Glu Gly Gln 340 345 350 Asp Arg Thr Met Ser Leu Ala Ala Ala Pro Val Arg Glu Ala Pro Pro 355 360 365 Pro Pro Thr Gly Ala Ser Ser Glu Pro Ser Val Pro Ala Leu Pro Gly 370 375 380 Ala Asp Pro Gln Arg Ser Ala Glu Leu Leu Leu Leu Ala Val Thr Arg 385 390 395 400 Glu Gly Leu Glu Arg Arg Ile Ile Ser Arg Lys Arg Ala Glu 405 410 3 100543 DNA Human misc_feature (1)...(100543) n = A,T,C or G 3 agccagacta ggagtgagcc agaagagggg aaggatggtg gaggcacagg ctgcactcta 60 ctggtgcccc agacccagac tgcatgccca ggctgcagtc caaaggatac tcggtgcggg 120 tccctgtccc ccatagcatc ttagatcagc tgctgaggct ggagcttctt ccattccttg 180 agcatcaggg gtgtgtatca tttccaaggg ttttcagaca atccctggtg acccctggca 240 gggggcggtt atcatggcga tcggtccatg gccttgcctc caagcagcac ccagcaatcc 300 ccatgcccac caatgcacta aatgtttgtg gtgggcctct ttctggaagc tcaccttctc 360 ctcctgtttg ccctccatct tccccaaacc agtacttctg 
gccatcctcc ttgtcaccac 420 aatgggaaaa ctgggtcctg gagactcaga aaccactgtg caggcctcga gtcttcccct 480 gtcctggcta acagggcatg gaatcagaga gaaaagtcat cttccacctc ctgaaggctg 540 ccagcgtcag ggcttggcac actgaggctg acaggggcct tctgaaggcc agaggagatg 600 gcccgggaca taaggctgaa gcaacctgtc tgagccaaag atctgtttgt gtcctcctga 660 atcttagtgg ccttctaaag gcgggtgtga tcagccatgg gtatcagaga cactggagtc 720 cagtagctgc taggtgggac acgggcacaa tttcacttgc agaccagctg cacggagtgg 780 ataaagagag agttctgtgt gggaatctcc tttggtggat catcagggag gtgaagtctt 840 tgtcatagcc tcatatccag cttgtgtgat accaattcca gtgaagctgg aacaagctgg 900 cactgctcaa acaggcctac caagacatca tgtttttttt ttttttttcc accaaacctg 960 gacctgaatg gggatgtgga cacacataga gtccagagga tgggaccctg gtcagtggtg 1020 gtgctggtgt gcggcatgaa gcagctgggg caggccctcc aggtgggagg aggagccagc 1080 ctctcctgtg gggtcctggg caagtcattt ccctccttga cgctctgttt cctcatctgg 1140 aaaatccagg aagcctgttg tgcagtcctc agagggtcat gataaggtgc aaaggaggaa 1200 gaaattttga gagttcttgt gccttcctct tcctgagagg gatgatggtg agaacagtgt 1260 ggataaccac atgcaactga gtccccaaaa ggccctgtga ggaaggtgtt gttcccatca 1320 tgcttcccag atgaggaaac agtctcaggg aggcctggct gcatgcccaa gggttacaca 1380 cacactgaat gttggagctg ggagtaaatc tagagtcagg gctcactgga ggtggtggga 1440 gcattgcacc agctgcctca tgcagttatc caagatctga ggtcaggggt ctggggtata 1500 actcggtgag agaaccacag tctcactgac cctgtttctg actctgctag gtagggaact 1560 ttttagagtc cagatgctgc tgttgctgct ccagggagtt aaggacaggg cttttccctc 1620 ccagctggag ttctccacat caagagacca gagtgtcttc ctcctctagc cctgccctcc 1680 tgtggccaca gtacttggaa tgccttcatt aacgtgacca gagaaagaga cttagagagc 1740 tgagttcaca gtggattggg aaaagagtga ggctggggac aacctgactt tgtgacttat 1800 taaaatccta ccttagagtt aactagaaag agcttggtgt ctttggaggc cagctgtgga 1860 aagaggaggt aaagatgtct tgtaaggagg ctgagggtcc ttggctgctc caagtgggag 1920 acttgggtag gggcgtcaag gagacatggg ttggtgagag ttcaaaggac aggccttatg 1980 caggaagctg agaatttgtg ctcatcactg ctaccacctt ggaggaacag ggagctgtac 2040 cctctctcgg cacacttctc actgacctga tgatgctgga cattgccata gaggattata 
2100 atgatgtgtg tgacactggg gcaggcaggc agtttgggaa ccaagatcct gaagcttgtg 2160 agaacagagt cctggaataa gccctgagat tgcagcactc agcaaacctg ctttcctgag 2220 agcctaccag atgcccatgt gaaagggggg tgtcagttcc atcttatcta tgagtgacag 2280 aggctccagc aaagacacca gtcactgttc cttgagtagt gaagctgcag agctaagtct 2340 cagcctctgc cttaggtggt acccattgga gaaaacagaa attgggcccc tcctaagaca 2400 ggaagttcct aaattgttga gtgccttttt tgtttcctag gcctcagttt ccctgtctat 2460 catcacagag aatcagggca aaaggtgtcc cttctgtgga gcccagaacc tgatgtaggt 2520 ccaagtcctg ttttatgcac atgccttgac cctggcagcc ctggcggtgg tgcagcatgg 2580 gaagtacagg ggatgagggc tagtcatggg ccagggggtc tttctgaggg atcttggctg 2640 tctaccttcc aggaaaatat aatcaacact aataaaggag gaaggagagc agctggggtc 2700 tcactttgag ggaggctggg gacgtgacag tcagacacca ccctgaagag gccactcgct 2760 ggcttcaccc tctgcatctt aaagttattg ggaaggtttg atacacagag gagatccatt 2820 ctaatggagg gtttgattag gggactagaa tcaacaataa attcctagat gaggaactgt 2880 ttatatccaa ctctgagaac aggttagggt tacatgggat tggaagagag ggtggggtcc 2940 cttaaaagaa aagccccaga aactcactgc tgctctatcc ctcccctata agttctcttt 3000 gttatcttcc acccaggacc tgtcagaatc ccacccttcc ttctgtctcc catcgaagtc 3060 ctccaggaaa tgcagctgtt tcagtgacag ggggtgattg ccatcttcca actgaggagg 3120 aatttggggt tttggtccag tccatgaagt gtgacacagt cagaataaaa ggtgagggcc 3180 taacagatta gcagacggta ggagaagact atcttgcagc cagcttcaga gagcctgtgg 3240 ccatggctcc caggtcaaca ttaggccctg ttgcctggga acccctgggc aggcagtggg 3300 aaggttgagg tgtggctcct ggtagcctca gaactgccac tatttcctga agctcctact 3360 tgttctgtca gctaagcccc catcccagta ggccagcaac accctcaaga ccaagaacag 3420 gccatggtga atctcagggc cactaagtac ctgggctggc aggggcagag tgcctcaggg 3480 ctcagtgttg tttgggctga gcatgggctt tgggagtcag acagctgcac tgggctccca 3540 gctgcaccat ggccagccct gtgtatgggg cactgctctg taacttgaga caccatagtc 3600 ataaatataa cacacccttc taactgcttt ttcttttttg tctatttttc tctataatcc 3660 ccatgtacta ctgatctgtt tatttaaatt aataaacatg ttatacagtg tatattgttc 3720 ttcctcatga tttctttact atattgtgtg attccactca tatgaggttc ttatggatgt 3780 
cccattcaca gaaacacaaa gtagaagagt agttagttcc caggggccaa aagaaggtaa 3840 atgggggctg tttctatctt ttatttttat ttttgcaaaa tgaacaaaat tccctatgaa 3900 tgtggatgat ggttgcagaa caatctgagc atgattaatt cctctgactt gcacgttaga 3960 aattgttaaa atagttaatt ttatgtatat tttaccacaa tgtaaaaaag gaatttttaa 4020 atgaacagac tgtagataca tgcaacagca taaatgaata tcacaaatat aatcttgcat 4080 ttaaaaattg atgtaaaagt atccaaacta tataatttca tttatactaa atccaaaaat 4140 caaaactgac attcttgctt tcactaatgg gaattagcta gttaaactaa cactctcaga 4200 gagaaaaatg atgaatccta gataaaatag tatatatcat tataaacact tctatatata 4260 atacatatat gagatatgtg tgtataagaa gtgaatgagg atgtcccctg tgccctcctt 4320 aggagagaca agaattgaag ttataatcca ggccaattag cactctcttt aaaaatcaac 4380 actcttcaaa gggacacaac agaatccaga gtctctataa ctcttgtata cagtctcttg 4440 tacacaattt tcaaattcgt gagatgggtg aagacacatg aaaatgcaat acatacacaa 4500 gataaaaggc aggcagtaga catctccaag atatccaaga tgtaatcagc agacaagaat 4560 ttgaaggcag ctattacaag tatgctaatg gaggcaaagg aaaaaatata cttataaagg 4620 aacagatgtg gaacctcagc agagaaataa aaaatagcca aatagaaaaa taagacacaa 4680 aaagaataat tttgagctta tctatagatc agaaacaaaa cacacaacaa tagaaattat 4740 tcaatctgaa gatacaagta aaaaaaaaag tttaaggaaa atgaacccag ccttacagac 4800 ctctcatggg gcactctggg agttgtagtc tcttctccgg ttccaaacgg tggctgttgt 4860 ggctgcagga taacagtccc agattgagac agggcagagg ctgtgtgcag ccctacagga 4920 aggggcaggg tggtgtaggc ctcttcactt accaagattt gctggccatt gattccgtgc 4980 caaacccttc ccaaggggat tgagtcagga gaggatcttg agagtcactc agggtcttcc 5040 cagagcatct gtgcctcctc cagcccacgg agctgcctga tttcctaagt ggctgtggga 5100 actggtctga agtaccagac gctgtctact gtgctgctgc cctctgttct atctaaccaa 5160 agtgcaagtt cagctgcctt tgaaagacat ccactgcctg acctggggat gcacgggttc 5220 agagctttgc agggagtgaa catgggctgt ggcttcatga aaatatcacc ctccccaacg 5280 cgtttttgca gatctggact tggaggcacg aaggacggta atcattgggt taccaaggtg 5340 ttactaggag cagaggagaa aaccgcaatt cctagccatg tgtctggtgt gacatttcgc 5400 caacccattt aagtgtgcag acccccaaat atctacctaa agattatgat agtttaggca 5460 ttttacattt 
aaaattattg gcttcatgtc cactgaagcc tgactggcca gtgtctcaaa 5520 gacacagatg atgatctgat ccctcaggaa cagatggttc tccagctttg ttggagtgac 5580 tttcaaggta tggagcactt atataaattt gcctaagagt aggatttgtg ctaagtacct 5640 gttcacaata acatcaaggt tgttttgatt taagggtagg gcttacataa gcagtagatt 5700 tcaatatata acatagattc ttgaaacccc cccaaaaaac attaaaggaa gtacctatgt 5760 cataatttta attttttatt tagtaattta aaatcttaac gtcttgtttt gttagctaat 5820 cttaagtttc tcactaaaaa ttagcatgat taagcatgaa aataatagct ttaagacagt 5880 ttttacccca gaaccagtga ttggataata gggttccagg ccctcccctt caggtcctgc 5940 gtgacagaat gtgaaccaat tcatagccaa gcgaggagag agtgaaacgt tcctaggtgc 6000 agcccctttc aggcaggact tactccttat gctgaaacct ggccctcact atgagacatt 6060 tgcatttaac cttgtatata agtttatttt tattcataaa ttatatatat gcacatatat 6120 gtatatatac agctggacat ggtgaatctc acctgtaatc ccaacacttt ggtaggctga 6180 ggggcgagga gctcttgaaa ccaggagttc gagaccagcc ttggcaacat agtgtgagca 6240 ccctccgccc cccaaccttt tctacaaaaa aaagaaagaa agaaaaaata gccaggcatg 6300 gtggagcttg tctgtggtcc cagctacttg ggggacttag gtgggagggt catttaagcc 6360 tgggaggtag aggctgcagt gagctgagat caggccactg catgcactcc agtctgagtg 6420 acagagcgag atcctctctc tctatctctt cctcactctg tgtgtgtgtg ttggggagag 6480 gggtgtgtgt ttgtgggtgt gtgtgtgagt gtgtatgtgt gtttattatt caaaatgaaa 6540 acaacataat gacaattatt tttattttta tttttttgag acagagtctc actttgtcaa 6600 ccaggctcca gtgcagtggt gcgatctcgg atcattgcaa cctccgaccc cgagattcaa 6660 gcaattcatc tgcctcagcc tcctgagcag ctgggattac aggtgcccac taccttgcct 6720 ggctaatttt tgtattttta gaagaggcag ggttttgcca tgttctccag gctggttttg 6780 aactcctgag ctcaagtaat ccgcccacct tggcctccca atgtgctggg atcacaggca 6840 tgagctgcca tgctcgacca atgacaatta tttaaaaatt ttagatttta caatctttct 6900 ggccttttgg cttttgaggc agcctgagct gtgaaaatag gcaatcctct attgcagcaa 6960 tgtgcaatag aagtaaaatg tgagccacgt gtgtcaaata aaattttgta gtagcaacat 7020 aaaaaaagaa aaatgagtga aattgatttt aataacaata tatcggaagt attttaacat 7080 atgatcagaa ttaaattatt ttatatatat attttgggaa gcacacaatt cagctcacag 7140 cagcctctgt caggacataa 
cgtttacatt gagatcccag taacctagca acagagggaa 7200 tgacatacaa agattctggg gaagaagatt ccacacagat gattctattg agcaatggtc 7260 ctgaaatggg agtgaacagg gtgagtttta ggaaccagag accagtgatg gcaaggaaga 7320 ggtatgaaaa gcaaggagag gaggtgaaat cacagagatc cagtgagata tataaacttg 7380 attgtgattt aagcagtttc taccttttgg attttgagaa caatactgtt attaacatga 7440 gtgtgccaat gtttcttgca ggtcctgctt tgaacattta gatagatatc cagaaatgag 7500 attgccagat cgtattagaa ttccattttt agtattctga ggaatatctg tactattttt 7560 cataatggct gcattattat tttttccacc accagtgtac agtgttccaa tttctctaca 7620 tccttgagaa catttgttat tatttcttgt ttgatagtgg ccatcctaat gagtgtgagg 7680 taatatctca ttgggatttt gctttttatt tctctcaaga tttgtagttt tgagcatctt 7740 tcaaattcct cttggccatt tgtatatctg gtttttaaaa acatacgttg atcattttgc 7800 ccatttttaa atagggttat ttactttttg ttgttgagtt ttagaggttg tttatacatt 7860 ctggatatta acgtctatca aatatgttat ctgcaatttt ttctcatttc ttaaatgaca 7920 tttttactcc acttaatgtt ttctttgatg tccagaaagc ttattacact tgatgtagtc 7980 ccatttttct gtttttattc ttgttacttc tgcttttaat gtcatgttca aaaaattacc 8040 aggacaaatg tcacaattgt ttaccctata atttatttta aaagttttag agctatctta 8100 cttacattta agtatttaat tcatttaaga tatttttgta tacagtgcaa gtgaaaagtt 8160 aaatttcatt ttttttaatt ttgatattcg gttttgtaac actatttgct aaagcgtctg 8220 ttcttcccct ttgttcggtc atggcaactt gattgaagat tatttgctga tattcatgaa 8280 gatttatttc tgggttctcc attctgtttc atcatctatt tgtctttctg tttgaatttc 8340 tacagctttg taatatattt tgcaataaag ttgcaatcca actttgttct tccctacagc 8400 tattttggct actcattgtc ccttgagatc ccatatgcat tttaggactt aaaataatat 8460 ttctccaaaa aagaaaactc agcttttgtg ccagacatgc cttgagattc catataaatt 8520 ttaggactta aaataatatt tctccaaaaa agtaacattg agattttgat ataaaatact 8580 tttctttgaa tttgtgcttc aatccaagta gtattgacat cttaacaata ataaaatttc 8640 tgatccttga acaagaggtc aagagtgtgc tgttttaagt ttcatatata ttttgatttg 8700 ccagtttcct tctgcttgtg atttgtagat gagggcttat tatgttgccc aggctggtct 8760 ccaacttttg gcctcaagct attctccgtc atcagcctcc tgatgtattt ggattacata 8820 gataagccac tgcacctggc ctctttattg 
tttttcttac atttttatga tttgaaggta 8880 atttttgaaa agatttaaaa atatgtatct ccttagaagg ttttcatttt taatgtagtc 8940 aaaaacacgt aaaattgacc atcttaaata ttttaagtac ataattaaat aatattaaat 9000 atatttgcac tgtcatgcaa catatctcta gaatgttttt gctgcaaaac tgaaactcaa 9060 tatctatgaa acaacaacta cccctttatc tcctcccctg aggctctgac tactttctgt 9120 ttctaggagt ttaactactt tagatatctt atttaactgg aatcacacag tgtccttttg 9180 tggctggttt attttattta cataatgtcc tcaagattta tgtttagtgt aaaaataatc 9240 agatctcctg cttttaaaaa actgaataat attccattgt ttgtatattt caaattgtct 9300 ttacctactc aatcactgag ggacgtttgg gctgcttcca cctattagct tttgtgagca 9360 atgctgcaat gcatatggat atacaaataa ttcttcattt ggccatatat atgagatttt 9420 atttctgtgc tctgttcttt tccgttggtc tgtctgtctg cctttatgcc agtaccaaat 9480 ggtttggtta ctgtagcttt gtaatacatt ataaagtcaa ggagtgtgat gcctccaata 9540 gcatttcttt ctttgaagtt tgtttggttc tcgatactca ctttagattc catataagtt 9600 ttagaatttt tttttgtatt tcttcaaaat aatatgacag ttaaaataag atggagatgg 9660 cattaaatct gtagatcact gtgtaatgtg gacatcttca caatattgtc ttccaaccct 9720 tgaataagag catgctcaaa agtatgttgt ttaatttcca catgtttgta gattttgcag 9780 tatttttctg ctatcaattt ctaattttat tcccttttaa taaaaaataa tagtttgtaa 9840 tattttaatc ttattttttg tatgttgtat gttgacaccc taaaccccag tacgtaagag 9900 tgtggctata tttgaagaaa gtgtcatcac acagataatt atgttaaaat gaggtccttg 9960 ggggcacagg tggggtggtg gagagaaggt cacattatac aaaatttcag ttaggaagaa 10020 taagtgcaag agatctattg tacttggtga ctacagttaa tgtattgtgt tcttgactaa 10080 tacagtagat ttcgagtgtt ctcacaacaa aaacatgatg ggtatgtgag gtaatgcata 10140 tgcaaactag cttgggttaa ccattccaca atatgtgtgt atttcaaaac agtaccataa 10200 atgcagacaa ttttgtgtca gttacaatca aaaaagtttt aaaatgagga ccttagggtg 10260 ggtcctaatc caatctaagt gatgtctcca tgaaagagga aataaggata caaatgtgca 10320 cacagagaga aatggccaca tgaggacaca atgagaatgt ggctacttac aagcctagga 10380 gagaggcctc cgagaaaaca caccctaccc acaccttgat gttggacttc atcctgtaga 10440 cgaagtcctc caccctcctt catcaggtgg aagcctttga ttctgaatat tctccaaatg 10500 ctggaaggta caaaagtgaa gagacagcac 
agacctcagg gtgaaaagtt taaagagaat 10560 aacatctttc cattgctgtg tcctatcccc tacacacacc tattccagtc tttattggtc 10620 ttttgtgttt tcgtgtcctg ggtatagtgt tagttgtaaa tctgtgttta catacaggat 10680 aacataaaac aaaggtaaac aataaaataa aaacagacag caaaactcaa ctaatagtgt 10740 ttgggcatgg tgacagtgaa gacaggagag tcacataaaa atggaggtgg aacttttgag 10800 ctaaatcaat gtcctgttgt gtttctgtgt ttctacatca gactctatag tggcaattgt 10860 caggttaggt gtttttatcc ttgcctgctc agtaagtgcc agaggagatt tttctaaact 10920 gggtgaggaa caggtagaga agtgtaagtg agacaaactt ccctgccatt tgccaaagtg 10980 gcagcacata atgtatcatg agtgcctcta ccctctgata tccaaaatat caactctata 11040 gatgatgttc tgttggctcc taatttagaa gcatctgcct tcatactctt tcctagaatg 11100 gtaacatctc tctgtcagta gctgataaaa ttactaaaat ctcatagttc tgttttccag 11160 ttaaggttca ctgcaactat atgggcagat ttacatagct caattcttcc agcttgacat 11220 tgttttcttt tgaagcttct gagagaggga gtcagcctcc acctagaggt ggcccttgga 11280 gtttttgaca cagaatctcc ctgacactac tacttacact gatttgaaag tcagtgttga 11340 ggttggcttg ctacaatgct cttgtcaaac tgaatcctgc cacatcaagg gcttggggct 11400 tcccagctta attttccaat tttgaagtga gagaatttga gttctacaag acatcagaag 11460 accgcttaga atataacaca ttctaaaagt aaatcggaat gccacaggaa catcttagtt 11520 gtaaaagaaa aattaggttc tttggtaaaa attttatgcc ccttgatatg gttttgctgt 11580 gtccccaccc aaatgtcatc tcaagtttta gctcccataa ttcccaagtg ttgtgggagg 11640 gacccagtag gagataattg aatcatggga gtgggtttcc ccatactgtt ttcgtggtag 11700 tgaatgagtc tcatgagatc tgatggtttc ataaggagaa acccctttgc tttactctca 11760 ttctctcttt tcttgtttgc caccaagtga gacatggctt ttaccttctg cgatgattgt 11820 gaggccccct cagccatgtg gaactgtaag tccattaaac ctctttcttt tgtaaatttc 11880 tcaatcttga gtatgtcttt atcagcagtg tgaaaatgaa tgaatacacc tcccttcact 11940 gtttgaagaa aaactgctgg ctctatgagg tcttcagttc actgaatatt ttttaaatgc 12000 ctagccctaa ctcactgaca agtcaaggaa gctgccacct tatggtgttc actaggctac 12060 tgtggatagc cctcattgcc aggcacacag aacctgaagc aggggtggct gctccactta 12120 atggtgagag ctcagggttt tgggtcattg tacatgttct aactgtttgg tcttccacat 12180 tgaaattaaa 
ggctattaga ttaaggacat cctttttcag attcaaatca tgcagaattt 12240 cagctgctga tctgtaaggt cattcattta agagtccatg gtaagggttg gtcctccgaa 12300 aatgctaacc acagttaagc caaaaaccaa gcctgaaccc atgtgtaatg cagtggtcat 12360 cactgcatgc cagatttgtt ctccttggtg aaagttgacc tatttgtctc atggtttaga 12420 aggccccaga ccattcctgg cataataact gcattggcct ttggcccact tcctgaagtg 12480 ttggtcatgt ctcaccacta ttaacacttt taggataata atcagctgac tccagccaaa 12540 gcatgtgtcc cttgaagctc attcatgtaa tattttaggc ttttgagacc aattgcactt 12600 aaatcacatt gtaactttta tcactaaatc tgttaacatg gccaaactgt ttgattttac 12660 ttcctttttt cctttaggtt cacacatgtt ggctacgtaa tttgttgatt gaatgtgact 12720 gtccccccag cacccaaaaa tctcttcttg actgactcca agatgaagag aacatcattg 12780 agttctatag gcatccattt tcaaaatttg atatttctca ctgacctttc ctggacagga 12840 tgtgtctttc ttttcctaag ctgcaacccc aggcaaatct tgtttgtaca ttctctgaat 12900 ggcagtccaa cccaaagtgg ggttgtggtt ttcaacttag ttgtggtgaa catttacata 12960 gttcttgcat ccatatcatc tggctctact agaaaaaaaa attactagaa gacagtatgg 13020 gtgactagaa tagaaaaaat gactacttaa aattataggt actggcatag cacacataaa 13080 ttctgtggct gtagaggaaa gacaaaaacc caaaacctaa aagtgaacaa attagaccca 13140 ggaactacag aaaaggaggg aagttaatat ctttctgtct ttctgaatta tccctagtgt 13200 ggcccagaga agagttgggg agtcctgttg ggccagaatg agtagtagta ttagttagac 13260 caaatggtag aacaggatgg gtgggttcag tttctctaag gaatttcatc cttttgtgat 13320 gtagacatgg aatcccagtg agcatcctgc actctcagtg tccaactgtc tgaaaccaca 13380 gttgtttctg aagactgaga acaattgctt tttgctgagg agacagcact ccacagctga 13440 ggtaaggtgg attctataga gcattaattc cattttctcc ttcttctgtg agctgggttt 13500 ctttcttcag ttttttatct agagatgaaa ttacattgtc aatatttgtg taaaatagag 13560 atataagtct ggcaagtaga cgcttagatg caaccctcag cagaaattct ggtctgtttt 13620 tcttctcact tttattcaat ttttgtcaca gaaagatgcc ctgacatgtg gtctctcaag 13680 agtgattaga acattgtact tctagaggag ttagttgatg gaacatggca gagcttagaa 13740 tgaggatgga agcctctgtc caaatctccc aggctatcta tgtgtgggaa gtggagtcag 13800 gaaagaccct agagctttgg aagtgatcat taagagaaaa caaaatccct gtaaattaga 
13860 gtacaaggga acatattcat tagccacctt tagagtcaga tgctcagggc tgagctggtg 13920 tgggggagtg ttcagacctg tgcatgcctg ggaaagcctc cgatcatgta tgaatggtga 13980 ggcttctggg tgcatgagat tgagtgccct tccttggcag agcaccactg agtgaacaat 14040 tgatttagga tcaagcatat ggagatcaac tttattttga atatctcaaa gacagaaact 14100 gaaaagattt ttttgcactt tgaaattgag taagggtgtc agaaacttca gtaaaaaagt 14160 cacaggagga aacccagaag tcttattcat ccccagaaac caccaaaatc ctgatccaaa 14220 ctgtaagaat tcacctcgta aaaatttgat ttttgaatag gaacagctcc ggtctacagc 14280 tcccagtgtg agcgatgcag aagacggttg atttctgcat ttccatctga ggtaccaggt 14340 tcatctcact atggagtgcc agacagtggg cgcacgtcag tgggtgtgcg caccgtgcgc 14400 gagtgaagca cggtgaggca ttgcctcact cacgaagtgc aagaggtcag ggagttccct 14460 ttcctattca aagaaaggca tgacagatgg cacctggaaa atcggatcac tcccacccga 14520 atactgcgct tttccgacgg gcttaaaaaa tggtgcacca ggagattata tcctgcacat 14580 ggcttatagg gtcctacgcc cacaaagtct cactgattgc tagcacagca gtctgagatc 14640 aaactgcaag gtggcagcga ggctggggga ggggcgcccg ccattgccca ggcttgctta 14700 tgtaaacaaa gcagccggga agctcgaact ggggtggagc ccaccatagc tccaggaggc 14760 ctgcctctgt aggctccacc tctaggggca gggcacagac aaacaaaaag acagcagtaa 14820 cctctgcaga cttaaatgtc cctgtctgac agctttgaag agagaagtgg ttctcccagc 14880 aagcagctgg agatctgaga acaggcagac tgcctcctca agtgggtccc tgacccctga 14940 ctcccgagca gcctaactgg gaggcacccc cgagcagggg cagcctaaca cttcacacag 15000 ctgggtactc caacagacct gcagctgagg gtcctgtctg ttagaaggaa aactaacaaa 15060 cagaaaggac atccacaaca aaaacccatc tgtacatcac catcatcaaa gaccaaaagt 15120 agataaaacc acaaagatgg ggaaaaaaca gagcagaaaa actggaaact ctaaacagca 15180 gagtgcctct cctcctccaa aggaacgcag ttcctcacca gcaatggaac aaagctggac 15240 agagaatgac tttgacaagc tgagagaaga aggcttcagg tgatcaaatt actccaagct 15300 acgggaggat attcaaacca aaggcaaaga agttgaaaac tttgaaaaaa atttcgaaga 15360 atgtataact agaataacca atacagagaa gtgcttaaag gagctgatgg agctgaaaac 15420 caaggctaga gaactacgtg aagaatgcag aagcctcagg agccgatgcg atcaaccgga 15480 agaaagggta tcagcgatgg aagatgaaat gaatgaaatg 
aagcgagaag ggaagtttag 15540 agaaaacaga ataaaaagaa atgagcaaag cctccaagaa atatgggact atgtgaaaag 15600 agcaaatcta cgtctgattg atgtacccga aagtgacggg gagaatggaa ccaagttgga 15660 aaacactctg caggatatta tccaggagaa cttccccaat ctagcaaggc aggccaacat 15720 tcagattcag gaaatacaga gaatgccaca aagatactcc ttgagaagag caactccaag 15780 acacataatt gtcagattca ccaaagtgga aatgaaggaa aaaatgttaa gggcagccag 15840 agagaaaagc caggttacac tcaaagggaa gcccatcaga ctaacagcag atctctcggc 15900 agaaactcta caagccagaa gagagtgggg gccaatattc aacattatta aagaaaagaa 15960 ttttcagccc agaatttcat atccagccaa actaagcttc ataagtgaag gagaaataaa 16020 atactttaca gacaagcaaa ttctgagaga ttttgtcacc accaggcctg ccctaaaaga 16080 gctcctgaag gaagcgctaa acatggaaag gaacaaccat taccagccac tgcaaaatca 16140 tgccaaaatg taaagacgat cgagactagg aagaaactgc atcaactaat gagcaaaata 16200 accagctaac atcaaaatga caggatcaaa ttcacacata acaatattaa ctttaaatgt 16260 aaatggacta aatgctccaa ttaaaagaca cagactggca aattggataa agagtcaaga 16320 cccatcagtg tgctgtattc aggaaaccca tctcacgtgc agagacacac ataggctcaa 16380 aataaaagga tggaggaaga tctaccaagc aaatggaaaa caaaaaaagg caggggttgc 16440 aatcctagtc tctgataaaa cagactttaa accaacaaag atcaaaagag acaaagaagg 16500 ccattacata atggtaaagg gatcaattca acaagaagag ctaactatcc taaatatata 16560 tgcacccaat acaggagcac ccagattcat aaagcaaatt cttagtgacc tacaaagaga 16620 cttagactcc cacacaataa taatgggaga ctttaacacc ccactgtcaa cattagacag 16680 atcaacgaga cagaaagtta acaaggatac ccagaaattg aactcagctc tgcaccaagc 16740 agacctaata gacatctaca gaactctcca ccccaaatca acagaatata catttttttc 16800 agcaccacac cacacctatt ccaaaattga ccacatactt ggaagtaaag ctctcctcag 16860 caaatgtaaa agaacagaaa ttataacaac ctgtctctca gaccacagtg caatcaaact 16920 agaactcagg attaagaatc tcactcgaaa ccgctcaact acatggaaac tgaacaacct 16980 gctcctgaat gactactgcg tacaaaacga aatgaaggca gaaataaaga tgttctttga 17040 aaccaacgag aacaaagaca caacatacca gaatctctgg gacgcattca aacctgtgtg 17100 tagagggaaa tttatagcac taaatgccca caagagaaac aggaaagatc caagattgac 17160 accctaacat tacaattaaa 
agaagtagaa aagcaagagc aaacacattc aaaagctagc 17220 agaaggcaag aaataactaa aatcagagca gaactgaagg aaatagagac acaaaaaacc 17280 cttcaaaaaa ttaacgaatc caggagctgg ttttttgaaa ggatcaacaa aattgataga 17340 ctgctagcaa aactattaaa taagaaaaga gagaagaatc aaataaacgc aataaaaaat 17400 gataaaggtg atatcaccac tgatcccaca gaaatacaat ctaccatcag agaatactac 17460 aaacacctct acacaaataa actagaaaat ctagaagaaa tggataaatt cctcgacata 17520 tacactctcc caagactaaa ccaggaagaa gttgaatctc tgaatagacc aataactgga 17580 gctgaaattg tggcaataat caatagcttc aaccaaaaag agtccaggac cagatggatt 17640 cacagccgaa ttctaccaga ggtacaagga ggaaatggta ccattccttc tgaaactatt 17700 ccaatcaata gaaaaagagg gaatccttcc taactcattt tatgaggcca gcatcatcct 17760 gataccaaaa ctgggcagag acacaacaaa aaaagagaat tttagaccaa tatccttgat 17820 gaacattgat gcacaaatcc tcaataaaat actggcaaac cgaatccagc agcacatcag 17880 aaagattatc caccattagc aagtgggctt cacccatggg atgcaaggca ggttcaatat 17940 atgcaaatca ataaatgtaa tccagcatat aaacagaacc aaagacaaaa accacatcct 18000 tatctcaata gatgcagaaa aggcctttga caaaattcaa caacccttca tgctaaaaac 18060 tctcaataaa ttaggtattg gtgggatgta tctcaaaata ataagagcta tctatgacaa 18120 acccacagcc aatatcttac tgaatgggca aaaattggaa gcattccctt tgaaaacggg 18180 cacaagacag ggatgccctc tctcaccact cctattcaac atagtgttgg aagttctggc 18240 cagggcaatt aggcaggaga aggaaataaa gggtattcaa ttaggaaaag aggaagtcaa 18300 attgtccctg tttgcagacg acatgattgt atatctagaa aaccccattg tctcagccca 18360 aaatctcctt aagcttataa gcaacttcag caaagtctca ggatacaaaa tcaatgtaca 18420 aaaatcacaa gcattcttag acaccaataa cagacaaaca gagagccaaa tcatgagtga 18480 actcccattc acaattgctt caaagagaat caaataccta ggaatccaac ttacaaggga 18540 tgggaaggac ctcttcaagg agaactacaa accactgctc aaggaaataa tagaggttaa 18600 atggaagaac attccatgct catgggtagg aagaatcaat atcttgaaaa tggccatact 18660 gcgcaaggta atttacagat tcaatgccat ccccatcaag ctaccaatga cttcttcaca 18720 gaattggaaa aaactacttt aaagtgcata tggaaccaaa aaagagcccg catcgccaag 18780 tcaatcctaa gccaaaagaa caaagctgga ggcatcatgc tacctgactt caaactatac 18840 
cacaaggcta cagtaaccaa aacagcatgg tactggtacc aaaacagaga tatagatcaa 18900 tggaacagaa cagagccctc agaaataacg ctgcatatct acaactatgt gatctttgac 18960 aaacctgaga aaaacaagca atggggaaag gattccctat ttaataaatg gtgctgggaa 19020 aactggctag ccatatgtag aaagctgaaa ctggatccct tcctcacacc ttatacaaaa 19080 attaattcaa gatgaattaa agacttaaac gttagaccta aaaccataaa aaccctagaa 19140 gaaaacctag gcattaccat tcaggacata gacatggaca aggacttcat gtctaaaaca 19200 ccaaaaacaa tggcaacgaa agccaaaatt gacaaatgag atctaattaa actaaagagc 19260 ttctgcacag caaaagaaac taccatcaga gtgaacaggc aacctacaaa atgggagaaa 19320 attttcgcaa cctactcatc tgacaaaggg ctaatatcca gaatctacaa tgaacgcaaa 19380 caaatttaga agaaaaaaac gaacaacccc atcaaaaagt gggcgaagga tatgaacaga 19440 cacttctcaa aagaagacat ttatgcagcc aaaaaacaca tgaaaaaatg ctcaccatca 19500 ctggccatca gaggaatgca aatcaaaacc acaatgagat accatatgac accagttaga 19560 atggcaatca ttaaaaagtc aggaaacaac aggtgctgga gaggatgtgg agaaatagga 19620 acatttttac actgttggtg ggactgtaaa ctagttcaac cattgtggaa gtcagtgtgg 19680 cgattcctca gggatctaga actagaaata ccatttgacc cagccatccc attactgggt 19740 atatacccaa aggactataa atcatgctgc tataaagaca catgcacacg tatgtttatt 19800 gcggcattat tcacaatagc aaagacttgg aaccaaccca aatgtccaac aatgatagac 19860 tggattaaga aaatgtggca catatacacc atggaatact atgcagccat aaaaaatgat 19920 gagttcatgt cctttgtagg gacatggatg aaattggaaa tcatcattct cagtaaacta 19980 tcgcaagaac aaaaaaccaa acactgcata ttctcactca taggtgggaa ttgaacaatg 20040 agaacacatg gacacaggaa ggggaacatc acactctggg gactgttgtg gggtgggggg 20100 aggggggagg gatagcattg ggagatatac ctaatgctag atgacgagtt agtgggtgca 20160 gcgcaccagc atggcacatg tatacatatg taactaacct gcacattgtg cacatgtacc 20220 ctaaaactta aagtataata ataataaata aataaattta taaaaagaaa attgttgttt 20280 aaaataagta aaaaaaattg atttttttca ccatatagat ttatctttca tttgaccttt 20340 atttaattac aagttttagt taatacattt tattttacct ttatgataga aatatcagat 20400 tcttaaactc aaagcattaa tgatgcctac ccaagtgtag tttttattgc caattaactt 20460 tctactatga tgcaaatttg tgtgtccttc taaaaactca tttgtaaaaa 
tttagtccct 20520 gatgtgatag ttttaaaaca tgtagccttt ttggaagtga ctaactcagg agggcttcat 20580 cctcatgaat gtaattaata ccctgtaata gaggttgaag ggagcaccct tgtcccttct 20640 gccattgaag acacagcaac aaggcatcat ttatgagaaa tgggaccctc cccagacact 20700 aaatttgctg gtgctttgat cttgaacttt ccagcttcca gaactgtgac caacgcattt 20760 ctgttattta tacatgaccc agtctaatgt attttgtttt agcaatctga acaaatgaag 20820 acactttctg atgcactgtg gtttattttt gaatttatag ttccactgag ctatctatat 20880 attcaaaaat caacatgtct cacagggtgg gacagccact ctaattattt tttcagagtt 20940 ttcttagctg ttcttatttg tgttttcatc tataggagtt ctccaataca atgcctgctt 21000 ccactaccaa atggtatcta tattgggaca aaattaaatt tattaattac tgctaaaaaa 21060 attgatgatt gagtaataat gagtttttct gctttagaac atgatgtgat tttctatttg 21120 tttatgctga ctttcctata tttcgaagac tttttatgtt ctttgtcata cattttacaa 21180 attcttgtta gatctatgta tagctagttc attttatttt gtcctgttga aaagtaaact 21240 gtagcacaat acaaattata caagttttct tggcagggaa tgatgatgca tatctgtggt 21300 tccagcactt tgagaggcca aagtgagagg actacttgac aagccaaagt gagaggggtt 21360 gaggccagcc tgggaaacat agtaagaccc tggctctaca aaacagaaga agaaattagc 21420 tgggtacagt ggtgcatgcc tgtatttcca gctactcatg aggctaaggc aggaggatgg 21480 cttaagctca ggatgtaagg ctgcagagag ctttgattct atctatgcac tctagcacgg 21540 gcgacagagc aagaacctgt ttcaaaaaac atttttttct tgagtaacaa gcaactaatg 21600 aattgtggaa caccggacca aaagaggttt accattttag tggcaaagtg tcagaggtat 21660 tgaagaaatg cgggagcaaa atattaaaat tatttgattc attgcaggta taaaattgtc 21720 ttatttggtt taccttatag acatatatta ctataagctt tttgagtatt tctgataact 21780 taagcttaaa ttctgttgtt gttgtttttc taatacaggc attcacaaaa aatagctcgc 21840 attatgtttt gcttctttgc aaatcaacaa gatgatgtca ctggggaggt ctaactgatt 21900 ctgtctgctc aggaatgttt ccaaggcttg gtctcctttt taatttactt tataaatcat 21960 tttgtaattt ttatctccca cacattctgt atcttgttaa ttacttttat acatatgtaa 22020 atatatttta tgtatgattt ctaagtatat cactcatgaa aattgactgt gacacaggca 22080 taccttatct tatagcacct tattgtgctt cacagatgtt ggatttctta caaattgaag 22140 ggttttggca accctatatt gaatgactct 
attaatacta tttttccaac atcatgtcat 22200 ctttgtgtgt gtgtatacct ctgtcagcat tttttagcaa taaagtacat tttttattaa 22260 ggtatgtata tttttaatat aaacatgggc tatttcaatt ttattctaca gtatagtgta 22320 aacataactt ttatatgttc tgggaagcaa gaaaaattgt gtgactcaca ttattgcaat 22380 atttgtttta ttgcactggt ctagagcaaa acccatatat ctcaaaggta catctatata 22440 tttttattga attggcccca ttaactgaaa cgataaattg ctattctttg actaaacaag 22500 agctgtggag tgtggagagg tcagtgtgaa aatgaagtag aagtgaatat aaacagtctt 22560 ctaactcaca caggtagtaa tacaaattaa ttatggatga aatataaaat tatctaaaat 22620 tgaatattct cagaaacatt aatgtttata tcattatgta tatggacatt aagaacataa 22680 taaaaataaa tttagatgat gaaggctttc ataatctcaa tgtaaaatgc aagaataaaa 22740 aatataaaat ttatcaatgt ataaaataaa tgacaatatt tatattatat ttaatatcct 22800 gcaatacagt acaatcaact gtaatgtata aaatagaaga aagtttaaat atggaaaatt 22860 taaaaggagg caaatatgag tcaggtaaag aataaataag tatgatcaat tatatttaag 22920 atacactgac ctgagtttta tggaaaaata ttaatggtaa acaccttgtg gagtgatttc 22980 accataattt tattaaactg taaaaatata ttaacatttt cccacaggga gtacaattaa 23040 agatttgtgg atattagtcc tccatgggct cagcgtggga aagttagagg ctacagcctt 23100 ttctgaatta aaagagaaac atataacttt ttatttattt ttctacaatt taaaatttgc 23160 aaaggcatat gaatgattac atcctaatat ttgtctgatt atatagaaat gcatgactgt 23220 caccagacat ctgaaagaca tcaaatgtct aacaagaaac ataaaatttg tttataatct 23280 tagccctctg tgaaatgcag ggttcaccat tttgagtata ttgttcaagc tactttccta 23340 tagcaggttc ttgctctcat tccaagacct gcaatttctg ctgaatgggg tcagggtgca 23400 cccagctcaa gcctcatctg acccactgac aggctcagtt atctcctgcc caggcaaggg 23460 atgggcttct ctatccaggg ctggatcccc agggccaggc aatgtggctg aaacaagcca 23520 gttcttagca gggaagacat aacctacctg ggtggctata aaatagaagg tctgcacctg 23580 agcacataga ggccgccaga accgggcaag ctgagtgagc tgctcccagg tcagtggaga 23640 atggacctgc tgcaccgata cccagtatag gtcttgataa atggccttga cacagcttgt 23700 aaagtcacca agcttttctg aaatgacagc cattgaactc ctagggtctg agacctgtgc 23760 tgcttggtgc acccagtgtg agtcatgaaa ggccctctgt ggtgggcatc acaggtctcc 23820 ttgagtttat 
tgctgtgcaa agtggaggac tttagtttct ttttcaacat caagctgtgc 23880 tcctctcctg gacagatccc cgcaaaagaa gcatgtgagt gagatactcg ccagcacagt 23940 tcccaccgac cctcatttcc aaaacctccc atgcaccttc aggtgaacat ttgaatcttc 24000 ccctccatcc tgaccatatt agtacttaaa tgaagtgaaa tttaacacct tctgagtccc 24060 caaatattct ctgagtgcca ggatctcaaa aattttttca gtcaccttgc tgctcaaacc 24120 ctctataaaa gtcaactgct tattttccct ttgggggaaa aaggcaaaat tctaccagct 24180 ctgtcttggc agctgtcctt ggaaccgatt ttccttttct tggagtttcc ctcatgtgag 24240 ctcgactctg gttctgttga taaaataaga gtttgagtaa gtgtctccaa ccaaacaccc 24300 tagaagcctt agttcatcct ggacacaaag gagctgaagt agctatcaaa cccagctctc 24360 ctctgttctc cagaatccat gtctatatgg ccctggctgc caaagagctc ccagtttcct 24420 tgccagggga gactgtgttg cagccctttc tctttttacc ttgaaagagt caaattttac 24480 ctaatctagc agtgctgttt ctagctttgg gcttagtttt ctcagaattc ttctcttcat 24540 ttagattggg ctctgatcct agtgcaacat ggaatttagg tgactcacct ctctcagaca 24600 cagagtctca tagtctatct ctgacaaata tttgtggatc agtcctttaa gtgaagctct 24660 tctgccagtg tcataagtga agatgtttct caaagtctcc ccagggctct aagccatctt 24720 ccatccccaa tttcaaaata aaaccctacc cagagacaca cagctcagta tccctgattc 24780 caacactcct tccagcctcc ataggaacag cccaaggcat taccctgtct ttgcctgtgc 24840 ttctcactgg aatgggagga gggggtctcg gctttttgtt tgaattgtct cttcttatct 24900 gagccctttt ctgtaaagga gatctgttgg aaagaaggct ggtcagtggg gcattggatg 24960 gaggagcagt ggagaattag ggtattcatt tccctttccc ttgtttaagc tcaagtgaaa 25020 ggtgctctct cttacacatc cagattcaga ctgtgtgtta ttctgctgga cctgcctttt 25080 acatgtcctt ggctggacgt ggcatactcc tattgactga acagtttcct ttttcttgcc 25140 actcattagg gcctcataag aaaagtttct tgctgtagtt ttactaggga cctagacaca 25200 gttaaagagg gacattttct gggtcttgtc atagtgtaaa aaaaccctaa acaaacaaaa 25260 aacaaagcca ggggcacagg aatcagaaaa taggggaatc atttttctaa tttctgtccc 25320 aattccacct ggaagtattt atgatactgt ttatttttca acatgcagaa ttaaacatac 25380 ctatattgaa atgtgtcata catttggcaa aggaagagaa ttacacatag tgttaaaatc 25440 atgtacatag atattataat ttttcaaatg cttggaaatg tcaaattaaa attatggttg 
25500 attgtattaa atagatacat atatgataac ataaaaatat gaagaaaaag taaataccaa 25560 ataaaatggc tcaaataatt agagattaga caattaatta gacaataatt agatcaaata 25620 caatcaactc gaatttattt aattagagca gatgctaact taatcagcgc ctgattcctg 25680 aggtagcaaa aagtctaggt ggagagagaa acttacccct tttcttaccc ttcctcggtc 25740 atcctgggag ctccactttc ctctgtagaa tttattcagc ctccttagta aacatggact 25800 tggtcccaaa caggtaaccc aactgaccac aagaaaagca gcctagatcc tgagcattca 25860 gctcctgtct tcacacaaca gacaccacct cagtcccatc aaagcctgtg aagtttccct 25920 acatccacca ttgagacata ttccagagca gcctctcaaa attgccttaa caggatggga 25980 cacgatatgg tggagctcct ggctcaggac agctgcctca ccccttccta ctgagaagtc 26040 tgtatctgct ggttagagct atcaaactgt agaaggctta gtgcctgtcc cagcaagtgt 26100 cccctcaaaa gccttcttgt tttctttcct tctgagaaaa gcatacaaga atgagacctt 26160 ctatgttaga gagaactcag cctccactct caattgactt ggttgactga tgaattgatg 26220 ccctgaggag gggatagatt cagggaagag actgtgctga atgagtctgt gttttcctag 26280 ctttgctgtc tgtgcaaata gtggaaccca gaaaaatatc gggtggtaga cacacagaca 26340 ctctaattgt ctgaatttaa atattattta aatggaactt atagtatcat tatatattga 26400 taccataata tcacataaaa tttgtttgat atataaacaa atattgatat tttatcataa 26460 tatcataaag cagttgtgca cacaatagca gataatattc tccggttcta taaagtttat 26520 atgttaatgt tcttacaaaa tttaactcag ttattttata aaattaccta acaaaatttt 26580 gttactgtgc tatcataata atacataaaa ctatgaaatc ataatatatt gtaatataat 26640 acatatgaat tatgatgtca taatgtatta tatcacaata catatgaatt atgccataat 26700 atattatgtc acaatacata caaattatgt cataatatag tgtgacatca tgatgcatgt 26760 gaattatgat gtcataacat actgtgatgc cacaatacat acaaattata tcataatata 26820 atgtgatgtc ataatatatt catttattat atttatgatt ttataataac atgaaatctt 26880 gtcagttaat tttataagaa aattgagtta aattttgtaa caacattaac atatgtagaa 26940 gcttacaatc atggtgaaag atgaaagatg agcaggcatc tcacatggta ggagtgggaa 27000 caggaaagtt gggataagga tacgcctcat ttttaaacca ccggatctca tgaatactca 27060 ccatgacaag aacagcacag agccatgagg aatccatccc catgattcaa acacctccca 27120 ccaggcccca cttgtaacat tagggattaa aatacaatat 
gagatttgga acaaatatct 27180 aaactatatc gtatgaccat tggaaaaaca gatgaaacat tcatggtttc tactgtccag 27240 atactttcat tccagagcaa atggctaaat gattgcattc aacattctga ggtcagaaga 27300 gagaagggag gtgtacaggg gactttggct gcatttgttc cacttcccat atgctgttgt 27360 tgtgagttct aacattatca ccagaaaggc tattcatgga cagaagaatt attgctattg 27420 ttgtgattat ttctatttct tttacattag taaaaataat ttttttagct tctcatataa 27480 ttttcctaaa aaagccctaa gagttttcgt taaattcctt gttattgtgt gtcataaaaa 27540 ttgacaggga aatggctaaa atagattaaa attacacaaa ctctaggagt caattctatc 27600 aggcaggctt aggaaagaca gaactggaaa tactccacca gcagaacacg agatgcatag 27660 ggcccacttt ctgtccctgc tgtccccagg tccaccctct tctaaggcct cctccaggtc 27720 tggcttcacc ctagaatctc ctctcacaga actaattaaa ggagatcaga aatttgagtg 27780 gtgactcctg ctgcctctcc tgcgctggtg cccccaattt cctgcaaata aaaagcagat 27840 aaatgggagc aaatagttat ctatttgtgg gccacaattt ctttttcatt gaagccatag 27900 agacatccca tctagcaacc tgttttttat ttttgcatat ccagtagttg ctctacaagg 27960 cacaagaaag tcaatataaa taccaaaaaa tccctctgga acagttcatc ctttctttct 28020 gtatcccttt catctgtcta tacagctttt attccataca tttttctttt taaagtgaat 28080 aaatttaaaa agtgaaagaa aaaatagaaa tgctgggccc ttcttctaaa tcctagaaat 28140 tatggaacac agcacccact ttccagggtg tgatgaggat tcactcacat catgtaaagt 28200 ttccagcaca gtggtctgta acagacttct gaacacatag tacatgctta ataagcattg 28260 tattaactca tgtgtacatg tttttttttt ttaatccaga cttactcaaa tgttgatgcc 28320 ttctctagcc tctgtaatct ttaaagaact ggcaaaaaat gacatgtttg taaatggtga 28380 ttggtggtgt ccacgttgag ccaaaactct ctgtgctgtg gtaagagtgt tgggtatcag 28440 gaattctgtg tgctgtgcct actttctcta gatgattgct actaccacgg atgtagtggg 28500 agaaacatca gcattaagag gagaattttt aagagaagct atttcacgga cccctttcca 28560 aaactgcaaa atcacatcaa ctaagaaaaa ggcctggaat cagaaatgat ttatatgtgc 28620 attgaaaatt gaggggcaat acttctctag gtggctctca gaataggctt tcaactataa 28680 tcacatgacc cctttaaaga cctacccata acctgtaccc tccaccacaa gatcctgtct 28740 gttgatcttg ggtgggagca tccatatggt tttaattagg aagtcaaatt gtccctcttt 28800 gatgattaca taatattata 
tctagaaaaa tctaaagacc accaaaaacc ttttagattg 28860 gataaatgaa atttaataat gtttcaggat tattaaaaat caatgtagaa aaattagtag 28920 catttttata cactaataat gatcaagctg agaaccaaat taaaaagtca attcctttta 28980 caatagctac aaaagtacct ggaaatacaa ttaatcaaac aggtgaaaga tatctacaag 29040 gaaaactaca aaacattgat gaaaaaaatt gtacataata caaacaaatg agaaaaacat 29100 ccctgctaat ggattggaag aattatcatt aaaatgacca tattgccctc caaaaatcta 29160 catattaaat gcaattccta ccaaaatgcc aatgttattt ttcatagaat tagcaaaaga 29220 attaaattta tttgtaacca tagcaaagtc tgaatagcca aagcacattt aagcaaagag 29280 aacaaagttg gagatattac actacctgac ttaaaattat actagaaagc ttgaataacc 29340 aacacaacat ggtactgata caaatagaca cccagatcaa tgtaagagaa tagagaacct 29400 agaaataaag ccacgtactg atcatttaca aaatcaataa aagcatacat ggaaaaatga 29460 cattctattc aacatgttgt gcttgaaaaa ttatattacc acatgcagaa gtatgaaatg 29520 gaacccgtga ctctcaccat gtaaaaaaat caactcaaca tggattaaaa gactaaaatg 29580 ttagacctga aatgataaaa attctagaag aaacccaatg ataaatgctt ctggacattg 29640 gcctgggcaa ataattcatg accaagatct caaaagcaga tgtagcaata acaaaaatag 29700 acaaatggaa cttaattaaa ctaaaaagtt cctgaaaagg agcttttaat tagcaggtga 29760 gcagacaacc catgaaatga gaaaaatgtt tgtgaactat acatgtgaca aagaactaat 29820 gtctacaata tacaaggaaa tcaaacaaca agaataaaac aagtaacctc actataaagc 29880 aggcaaagaa tgagaacaga tatttttcaa atagaagaca atgattgcca agaagcatgt 29940 aaaaaatgtt aaacattgcc aatgatcaga gaaattccaa ttaaaaatcg caatgaagta 30000 ccattttata ccattcataa tggctaatta ttaaaaagca gaaaaatgac agatactggc 30060 aaggatacag agaaaagaga acacttattc attgttcgag ggagtgtaaa tttctacaac 30120 ctctatggca aacagtataa agatttttca gaaaaactaa aaatggaatt tccatttgat 30180 cctgcacttc ctctactggg tatctacccg aaggaaaata attcattaca taaagaagat 30240 acccacactc atatgtttat tgcagcacta ttcacaatag caacgatatg gagtcagttt 30300 aaatttatca gtcaatgatt ggataaagaa aatgtactat acatttattc catggaaaac 30360 tactcagcca taaagaataa aatcatgtct tttgcagcaa catgaatgca actggaggcc 30420 attattgtaa gtgaaataat tcagaaacag aacataaagc actacatttt cttacttaca 30480 
agtgggagct caataatgca tacacttgga catagagatt ggaaaaatag acactggaga 30540 ctcacaaata tgggaggttg gtagaggggt taggaatgag aaaataccta actgggaaaa 30600 tgagcaccgt tcagatgatt gttgcacaga agcccacact tcatccctat gcaacatgtc 30660 cctgtaatga agctacattt ataccctcta gtgtattaat aaagagaaaa aaactgactt 30720 tttttcagtt ttgacaagag ggcaactgac aagagggcag actgaataga ctttataatt 30780 ttcataaata tttatattag aaaaataact ttaataaaaa taaaaaataa tattgtattt 30840 ttaagaatgg tataaaaaga taatttgatg aattggaata gttaggactt agcacataca 30900 atatgttaag caagattcta agccattaga catttgtaga cagaatatct aacagtataa 30960 aataaataac acaaatattc attggcaatg acaaattgac atattttcaa tcctattaga 31020 tgatattaaa accattacaa aatttactgt tttgtttcat aataaaaatc gtaatgttaa 31080 ataatttcat ttaaaagttt gactaattag gcatatagag aaatgggcag tatgttgacc 31140 agtaaacagg atacaaatta tttttcgaaa agcaaacaac atttttatat tttatttatt 31200 tatttatttt tctggtcgga ggagctgctt tattgtctga ggacatagta cggtcctctc 31260 cctgggaggt ggggtcttcc accggtcacc cgaggcagtg ttccaggagg ctccatgcaa 31320 ctcactgctg ggctcagctg ggggctgggc cttggagaag gcgaactgtg cagggaagca 31380 gtagctgtgg gtcctcaccg cccgctctgc tttgctgcac tgggtccctg gtgctcctcg 31440 aagtcccact gagcctgcgt ctctatagga cggtgaccat ccagcccaga atcttctagt 31500 cagagcacag tttgaccagg ccaggcattt ccgcttcctc tccttgggct ggactttgca 31560 cttgggtttc ttccagtcct tcttctgccg cctgtctgcc agagcttaaa ttccagcctc 31620 acaaatgttc cagctagaaa aggcgtgtcc gcggcgcctc gcacactggt ctcttggaag 31680 gcacgggcgg gtggattcct ccagggccac ctgcaggtcc gggtgctggg ccctgtgagc 31740 tcgaccccac ccctgccccc cgagcccatc cactgagcca acggtatcct cagccttcac 31800 ttgcttccac catcacaccg ccctacaaag cgggtgtgcg ctcctcagct ctccctgcat 31860 gccaggagcc gcctcctccc ctaccctggc cctggggtcc atgcccacag aacgctgggc 31920 agaggtgaag gaactgggaa ataccccttt ctcaaatgac tttggggtcc gcctggtctt 31980 ctctactccc tcccacccta cctgcactgt tccctggggc ctgcagtttt agcaaagttc 32040 cctgcccccc acccgtgtca ggaagcagtc ctgatgccca ctcccaccct ttcccctctt 32100 ctggcccatt ctctctcccc actgggtcac tgacaggaat ccctctcctc 
cctgctgtcc 32160 ctggagcctg tctgctttcc gcgctcagcc tcctccctga ccgcttctcc ctccttatcc 32220 tgaatctccc gactcccgtg gggtggcccc cccacccatc ccggggccca ggcaaaaccg 32280 acaaaattat ttcaaatggg aagatctgaa ttccactgaa gtcatgacaa cagaaaccct 32340 tgtggtcata ttttaaagaa tgatagtagc aactatccat gtaaagtata actagcctta 32400 aaaataccca ccactttgcc aaaaacaaca acaaaaccat gttatagctg aaaaacaaat 32460 tcaataaact tacaggatac aatattaaca tgaaaatcag ttgcattttt gtacagtaac 32520 aacaaagtat ctgagaaagg aataaagaaa acaattccat ttacaatatt atcaaataga 32580 attaaatact taagtcatta aatagaatga gtttaacgaa gaatagtaaa gatctgcata 32640 ctgaaaacta taaaatgatg ataaaaaatt gaagaataca aaatggaaaa tatatcctgt 32700 gttcatcgat tctaaaaatt aacattgtta aaatatctat actacataaa gttatgctac 32760 agttaaataa atttctatca aaattttaat gccattaaaa taaatgtaaa acaaacattt 32820 gtaaaattag tatggagcca caaaagaccc caaatagcca aatattgaga agaataaaaa 32880 ggctgaaagc ctcaaacttc ttgatttcaa acaatattac aaagctatac tcatcaagaa 32940 agtatagtac ctacatagaa acaaatggaa tagaatagag gacccagaaa taaattcaca 33000 catatacagt aaactgatcc caccaaatta ggaaaagata gacacatcaa gaaatggtgt 33060 tagaaaaaac tagctatgca cacacaaaaa agtaaagcct tctcttatca caaaaatgag 33120 tttaaaataa agacataaac attataactg aaataatgaa tcctctaaaa aaaaaaacag 33180 ggagaaagct ccttgacact ggggggggca nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 33240 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 33300 nnnnnnnnnn ctaataataa cctgtacttg aaatttgttt acaaaagtag atctcaggtg 33360 ccctcaccac atacacagaa acgtataagg taactatgtg tggggataga tatgttaacc 33420 agcttgaggt aatttaaaga tatatacaca tctaaaagca ctctattgta caccctaaac 33480 atacatactt ttcaatatgt aaatcatagc tcaataaatg tggcgaaaaa attaagagta 33540 atcagtaggt catatttagc cacaggtaca ctttatttaa aactctcttg tgaactgttt 33600 ctggctgaac ctgggaacgc ccccagtact tctgacccct tgacttctcc aacactactt 33660 ctattcttcc tgtagctggg gaagatgtga tagttcccaa gagttttatg gtttccaggc 33720 catgcagatc tgggagggcc atcagccgtc cccttgctca ttgcgcacag ctccaggcaa 33780 cccatgtcag ggcatcgcat cccggcgagg 
gccacagcag atagcagggg cgaggtagct 33840 cttggctgcc tgcaatccaa tggcctctgg ctccacaggc gtccttcaag gccctaccac 33900 agcccctcct ctcatgacca gtgaaggcaa tgtaactgca gaagacactg aggaggcaag 33960 taagaaaaaa tcttggcccc cgtgtattaa ttatcattag atcacccttg ttgtatacag 34020 gatcactctt tatcattact gctgtaaaac ctgctacaat tggaagagaa aatgatctgg 34080 agcatactct ttgaattcct gaagttgctg tcagtaagtt tcagcaaaaa tttattctga 34140 ccatgattta caaagttatg tccttgtata tcaaggcagt caaaatgaaa caattgttaa 34200 tggaaaacgg attgttcggc tgacaactaa atgtgagcct tatgtaccta agcatagaca 34260 tgaaatgaaa attggagaga ctaggttacc ctttcacatt caccctggca gtgatacttg 34320 tgatggctgt gaaccacagc aggttggaac tcactttcac cttgataaga gagagaattg 34380 tttattgaat taattgttta cagctaagta ggggaaaaaa gagttggaaa taagaaaata 34440 atttttaaaa tgtgagtaaa gtatgtttta cattatacag actatgaaga tgaatagaca 34500 ttgaagaacc tgaaacataa agatacagct ggaaaaagta aggagcctat tggaagtgaa 34560 gaacacttcc aaagagttga tgcacctgca tctgttcatt ctgaaattac tgacagcaac 34620 aaaagtcaga agatgtttga ggaggtgggt tggaaaaaag aagagggcct ggggaaggat 34680 ggtggaggaa tgaaaactcc aattttgctt tagctttagc agacacacgt aggcttgagg 34740 acaggcaatt cctcctcaat tgaagatgtt caccttctcc aaaacaaaaa caaactggga 34800 caaagcatga gataggtttt ctgaaaattt cccagaaact aaaccttgaa aagatgacct 34860 agggactagg ccttgggtaa aaaggggctg tcgagtgagg gttagtcata gaagaaaact 34920 caagtttttt aaaaaataga gtttggaaac tcttatttat tttattttat tttttgcaga 34980 acttttctcc caaaaagagt ctgtggcaca gtttacccct tcctgattca gaaatgtgta 35040 ataaagtttg gtttgcaact tttcaatgcc atttttttaa actaataaat agtgatttaa 35100 tcaagttatg cagtaagtgg actaaagttt acagggcaca catgaagtgt gtcacacttc 35160 attattttat cgtgtcgtct atgacatccc tgtgagcaca aagccctcta tgcacaattc 35220 atattaccac tactgacgtc aatatacatc ttgtctctgt ctcctccttt cccagcanga 35280 cccccagcag nagaaatatt caaagtgtta aaataaatct gctgtatgca tcctnnnnnn 35340 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 35400 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnccattt acaatattat caaatagaat 35460 taaatactta 
agtcattaaa tagaatgagt ttaacgaaga atagtaaaga tctgcatact 35520 gaaaactata aaatgatgat aaaaaattga agaatacaaa atggaaaata tatcctgtgt 35580 tcatcgattc taaaaattaa cattgttaaa atatctatac tacataaagt tatctacagt 35640 taaataaatt tctatcaaaa ttttaatgcc attaaaataa atgtaaaaca aacatttgta 35700 aaattagtat ggagccacaa aagaccccaa atagccaaat attgagaaga ataaaaaggc 35760 tgaaagcctc aaacttcttg atttcaaaca atattacaaa gctatactca tcaagaaagt 35820 atagtaccta catagaaaca aatggaatag aatagaggac ccagaaataa attcacacat 35880 atacagtaaa ctgatcccac caaattagga aaagatagac acatcaagaa atggtgttag 35940 aaaaaactag ctatgcacac acaaaaaagt aaagccttct cttatcacaa aaatgagttt 36000 aaaataaaga cataaacatt agaactgaaa taatgaatcc tctaaaaaaa aaaacaggga 36060 gaaagctccc ttgacactgg tggtggcaac gatgttttgg ctttgacaga aagaacacaa 36120 gcaactaatg caaaaataaa aaagtggagc tatatcaaag taaaaacttt ctccacaata 36180 aagaaaacaa tcaacaaaat ataaagacaa tatatgggat ggaagaaaat atttgtaaac 36240 catatgtagg ataagatgtt gctatccaaa atatacataa cactagtcaa tgtggaaaaa 36300 aaaactcaca gcagaacaaa acaaaaacca atttcctgat aaactgggca aaatatctga 36360 atagatgttt tcccaaagac atacaagtgg ccagcaggta tataaaaaga ttctcaacat 36420 tgctaattat caaagtaatg aaaatcaaaa tcacaatgag atatcacctc atcgtggtgt 36480 tagcacagct attatcaaaa attcaaaata caaaaagtgt tagggtgcag agaaaggaga 36540 atatttgtcc accattgctg aacatgctca ctggtgcagc cattataaac aaaaaaacaa 36600 cacaaaaaaa caaacaagaa aaccagtagg gaggttcctt aaaattttta aactagaagt 36660 atctttaatc ccaatttgga gtatacagcc aatagacata aaattagtat caccaagcgt 36720 acctgcactc ctctgataca aataaatagg taaactgtga gaaagagata atttcagctt 36780 taataaagaa tgaaattgtt tcatttacaa caatgttgat gaaccttgaa gacattatgc 36840 taagtgaaat aagcgaaata cagaaagaca aatactgctt gatctcattt taatgtgaaa 36900 tcttaaagaa agaaagaaaa agaaagaaag aaagaaaaaa tgaaacagag actagaatgg 36960 tagttaccat gggctaagaa ttggggaaaa gtggggagat acccatcgaa gggtgcatac 37020 cttcagttac ataataaaaa acttctgggg accttatgtg cagaatggtg actatagcta 37080 ataataacct gtacttgaaa tttgtttaca aaagtagatc tcaggtgccc tcaccacata 
37140 cacagaaacg tataaggtaa ctatgtgtgg ggatagatat gttaaccagc ttgaggtaat 37200 ttaaagatat atacacatct aaaagcactc tattgtacac cctaaacata catacttttc 37260 aatatgtaaa tcatagctca ataaatgtgg cgaaaaaatt aagagtaatc agtaggtcat 37320 atttagccac aggtacactt tatttaaaac tctcttgtga actgtttctg gctgaacctg 37380 ggaacgcccc cagtacttct gaccccttga cttctccaac actacttcta ttcttcctgt 37440 agctggggaa gatgtgatag ttcnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 37500 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 37560 nnnaaggatg gtggaggaat gaaaactcca attttgcttt agctttagca gacacacgta 37620 ggcttgagga caggcaattc ctcctcaatt gaagatgttc accttctcca aaacaaaaac 37680 aaactgggac aaagcatgag ataggttttc tgaaaatttc ccagaaacta aaccttgaaa 37740 agatgaccta gggactaggc cttgggtaaa aaggggctgt cgagtgaggg ttagtcatag 37800 aagaaaactc aagtttttta aaaaatagag tttggaaact cttatttatt ttattttatt 37860 ttttgcagaa cttttctccc aaaaagagtc tgtggcacag tttacccctt cctgattcag 37920 aaatgtgtaa taaagtttgg tttgcaactt ttcaatgcca tttttttaaa ctaataaata 37980 gtgattgaat caagttatgc agtaagtgga ctaaagttta cagggcacag atgagtttgt 38040 caaacttcat tattttatcg tgtcatttat gacatccatg taagcaaaaa gccatataag 38100 caaaattcat ataaccacta atgacttaaa tatacatttg tctttgtctc catatattca 38160 cagtaagacc cacagcaaaa gaaatatcaa aagtttataa aaataaatct ggctatatgc 38220 attcttgttt atgcccttta gaacctagat aaaaggacct ttataataaa ggtctaaata 38280 ataccattta aagcctaaat aatactattt aggctaaata aataaataaa ggtccaaata 38340 atactattta aagcctaaat aatactatcg aagaactaac gaacaggtga catactatag 38400 aaaagtagtc ttttactgtt ttcttctgta aagaatctgt tgttttgtgc tatatattca 38460 gcatttatat ttcatttgtt tcatagctaa tgaaatattt agatatgaac aactgagtac 38520 agtactgaaa tagtgcactg gcatttgtaa tttttataaa tattattgca ggcagtggag 38580 ttgtgccaga gaaatctgat ttctagtaca aaaggaatac ttagctaggg cctgaagttt 38640 aagatattta ttgaacatgt cctcaattgc aatataaaca ttataacatt ttttaaaaat 38700 tctttttaat acattctgaa ttaaacaaaa atttcaaatc atacctttta tgaagttagg 38760 gaagtgctaa tgaggtagaa tggcagttga agccaaaaca 
tctgaattta tgtaaataat 38820 tttacccaca ttagtttttt gttaaaggaa acaataattc tcaccatata ctatttactg 38880 cagtgaagta ttttcagaaa cgtgtgcata atacataatt aattttctaa tggtaataca 38940 ttctacaaat tattacaatt gtggtttctt gaagagagca gtatttcaaa tgaagaaact 39000 tggaatttct catgagggga tgatcatgag gatgatcatg aggggatgtc aaatccatgt 39060 gggttgttca gtttctaaca agacaaatgg aataaacata gatgaactga ctgcaggcat 39120 ccccagtgtg ctgtcttcca tatgtgagaa caaaattata aattatgtta cacaaaagga 39180 gataagttac tagtgtgaag cgttagataa attatatggc acataaaact gagatatgag 39240 gtttgtattt acatgattga caatataata aaaaaagaaa caaaaataat gtggaaaaga 39300 actacgagat tgtaaaatca tatttggaaa agaggcaaat ggaaagtaaa ctattgaaaa 39360 tgtagtaaaa ctactgtaca gcatacaaat tacacattaa aataggctga gctgatgagt 39420 aggactaata aactgaaacg ctgagcacaa ttaatgtagt catatacttt ttagaaggca 39480 aattaataca aaatacaaat atatttattt atatatgaag attagctggg aagaaatgaa 39540 aaataatttg ttttactaga ctatataaac aggaaggtaa tattcaaaaa atcagtgact 39600 cataagtttc agaaattgga aacacatgaa tcctatcaaa gaaataatgg ttttggttat 39660 tgtgaatact gcttcaataa acatgggagt gcaattgtct ctttgacata ctgattttat 39720 ttccttttga tatataccca gtggtgagat tgctgggtcg tatatatggt agttctattt 39780 ataatgtctt gaaaaatttc catgctgctt ttcctacctg gccggggctc cgacagctgg 39840 gcatccggcc gcagtccctc tctcaggatc cctccaccct ccgcctccca acagttcggg 39900 cttttgtgta cgctgtggct gctgcttctg ctgccgaagc ttggcattgg agacacctcg 39960 tcctcctctc aggacagatc catgaaccca tcggcagcgg cggtgagcga tgccttccct 40020 ctgccacaag gcgccgcctg cagagcctgc cgcgtccgcc acgcccagag cgtacccgca 40080 gctcagcgcc gagttactcc tgctggcggt ggccagggag tgactggagt cgccggacac 40140 ccctggggag caggcgaagg aggagctgca gccaccacgt cctctctttc cccagggatg 40200 tgcagaatta ccatgaaatt atgactcctc atcctaagaa ttaccaatgg gaaaattgga 40260 gtctagaaaa tgttgccatc attttagccc accggttccc caatagctgt atttaggtga 40320 taaagtgctc tcgaatgcat tgcacagact cagttgccat gacgattttg tgaaaagtaa 40380 catgtttggt tccccagaac acaatactga ctctggagct tttaagcacc tttgtattat 40440 tagttaatgc ttttaagtca 
taatagttta tcaaagaaaa atttgaacgg ttggaataag 40500 gactccacgc atgtaattgc agtccagttt cctgatatta caaatcgttt ccagggagaa 40560 aaagagagga cctgtgaaaa acctgatgag ttggctatga gtttttatcc accatcacta 40620 aatgatgcat cttttaattt gactggattc aataaagatt gtgttgtttt gaatcagttt 40680 ctttttgaat tgaaagaagc caagaaacgc aaagacatag atacttccat taaaagcata 40740 agaacaatat attggctgga gggtggtcat tccgtaggaa gtaatacctg ggttacttat 40800 ccacaagtct tgaaagaatt tgtacagtga gagattattg ttcacaccca tgtaacactt 40860 caccaagtac atgatccaaa gagatcttgg attggaaaag agcaaaataa atttgcttag 40920 atacttgggg atattggtat gcaggtgact agctgaattc atttcatgaa ggaagctccc 40980 tgtatagaga atccctttag agttcatgaa gtagtttgag gctacaaata tattgatgta 41040 cttgttcagt ggaagagcat aagcactttg agtgttatga attcagataa tggaatgtaa 41100 ttcataggtg cattgtcagt atgggggaaa cacacgttcc tgaaatatga gtgaaatatg 41160 caatagtatt tcttccttgg gaatgtgagc agtttttaat ttgtgttgag ttagaattag 41220 ttaatttaaa atctaacaag gtggtttgtg ataatactga ggagatataa gacccttaaa 41280 aggaaagtta caacatagta cttctagaat ataactaaaa ttgtttctgt tggaaatagt 41340 gattctctga gtaatgttac taatcgtggt gatattttaa cagtaattag ctattttggc 41400 acttaaaact tgaatggaaa cagtttattt ctcttcaaac aaaagcaaag gcacaatgtt 41460 gtttcctatc attttggaat aactgtaccc tgcctcttgt gttttgtaaa ctcattcact 41520 cattctttaa tgtgccacca agtacttttt tcttgagagt caaaatatat ttgtttcaca 41580 atgtccaaaa atgtgcaaga atgtaaagct ggtttttaaa aacatagcca tgtgatggca 41640 tgtgccgtta gtcccagcta ctcaggaggc taaggcagaa ggatcctttg agtccaggct 41700 ataacgcacc atgattgtgt ttgtgactag ctactgcact ccagcctaag caacatagtg 41760 agacctcatc tcaaaaaaaa aaaaaaagaa aaagaaaaga agaaaaaaaa gaaaaacaga 41820 acaaattaga cttaagtacc atcacttaat ttttagttga cagtctttag ttgattgttt 41880 tggataagac attctggggc ttcttgaatc ttggccaaaa accagttgtt tttgaaaact 41940 gttttaaatt aagcatattt atgtattttg gataaaaatc aactacaaag aaaattttat 42000 ttttttcatt atattagtct ttttgaaaga gaacaactta gggaagataa atatataatg 42060 ctctatttgt caatgctgta ttaaaaagga aacagatttc ataaatctaa atcaatgttt 42120 
ctccacaatc atgactttgt ctcaaaaaaa aagttatttt tggccaaaat gcaaaattat 42180 attgctgtga caaaagtcac aaggaatcgc ttaaacatca tccagcctga ggccagatca 42240 acatatgaca gtcacaattt caactctgaa ctgcatccat gtgtgagatt tagagtctca 42300 ttagtggact ctgcccatgt atgaggttga caatcctaat tgttttctac atgtgtctat 42360 gagtgtcaca aggtcactgt ttgctgggcc ctgttatgta actctctcta aaccccaagg 42420 ggtttataaa atacatgtga gtgtcataat cttctgtggc cattttagaa ttagtagacc 42480 aaggacctta gttgttgccg taagcctagc tattaaagtc aaaattactc ctcctggctg 42540 ggtccatgta agaaagctat catcatgact gtggcctggg cctaggtata cgtcacaccc 42600 cacctgtgag cagaaacagg gaggaaaacc acatcatctg aatgctggga cagggaaatg 42660 tcaatattgc tcaattggta ggacccacac aggaaagtca catcaccttt gtgctaggct 42720 cagtgatatg tcacaatgcc cactgtagac agaaccttgt caaaagagtc atatcaccta 42780 ggtgctgagc ccagcaatat gtcacaatcc cccttgttaa cagggcccag tcatgagagg 42840 agagtcctat cacctagatg atatgcccag atttctgtta caaaccttac tggaggcagg 42900 gcacaggcag gaaagtagag tatcatcact caggtgatgg gctcacaggc acattacaat 42960 gtcccttgtg ggtatgatcc agacaggaca gtcacatcag ctcagtgttt gggtcagttg 43020 tatgtaataa tcccaactgt tggcagggct catgcaaggg agtaaagtca atcaagtggt 43080 gagaaaaaaa ttgtatttca aagtcacacc tacaggaaag tccagggatg agattcacag 43140 tcccacacat tttctgactc ctaggataag agacaacagc ttctttgaat tttgttaaag 43200 tacacaaatc acaatctcaa tggtggacag aattcatgca tgaaagctcc aaacacacct 43260 gcaaacactg tctaattaga gcagccacga tctcacaggt gtgctgaatt ttgatatata 43320 agtcaccatt ctacctgtga actgtatcca tgtatgagag tcacaatgtg aactttcaac 43380 tgatctgggt atgatattca gaacctcaac aatggtctat gtccctatag gaagatgaca 43440 ttcctctctg tatgtcgtgt gtgcttggaa gagtcaaaat gtcacctgtg tgctgggcct 43500 ttattagtca ccttctgtgc cacttaatgg cttcatatgg tctgcatgag aatgaaaacc 43560 tgctctaaga tttcatgatg gcataaaccc atgatcctac atgttgccct aagttgaggc 43620 atgagagtca aaacctgtct tatttgctgg ctccatgtat gagagtcatc agtgtgcctg 43680 tgaactgggt tcagaaatga gccaccatcc catttgtgga tggatccaga tatgacagtc 43740 acaattccaa ctgggaaatg tttctgtaag tgagatccag ggcctcgtga 
gtgggttctg 43800 ttcacatctg taatctttat acaattagga gatgcagaac cttactgatt gttctatgag 43860 agtaaaaata tcttctactg gctgggccct catatgagag tcttcattat tcctgtgacc 43920 tgaacctagg gatatgtcac aatcttacct gtgagtagaa ccaggcagga gagtctcatc 43980 agctgaatac tgagccaggg atatattagt gttccccctg tggttgagtg ctggtaggat 44040 attcacatca ctatggtgct gggacaagtg atgtgtcaca atgtcccctg tgggcagaac 44100 ccaggctaag cattacatca cacgagtgct gggctcatca atatgtcaca atacccttgg 44160 gaatggggcc caggcaggag agtacaataa catcatctgg aaatgggatg aaaggaatat 44220 cacaaagcca cctgtgagaa gggactaggc aggagggctg cattactggg gttatgggta 44280 tagtatatgt cacgatccac actgtgggtt atagagagac aatatattca tctcagttta 44340 tttgcgggtg gacaaaagta tatgtcagtc acacatgtgg gaaggtctag aaataaaatt 44400 tacattccta cagatgnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 44460 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnntaat 44520 ttctggctgg acaaagtata tgtcagtcac acatgtggga aggtctagga ataaaagtta 44580 cattcctgca gtatgttctg ggtccaggta taaaagtatg cacctcataa acttttacac 44640 acctcatttg cccaagtttg caagtcatga tatcaacagt aaattggatc catacatgaa 44700 agactcaacc tcaaacaaca cacattagaa gagttagagc ctaacatatt tgctgaatct 44760 tagtcagaga ctcatcatct cacctgaagg ctggacccac atataagagt aataatacca 44820 cctttggact gcctttgggt gtgaaattca gaaattcaat ggtgggctgt gtccatgtgg 44880 gagtgtgaaa aactttactg tcaggagtgt atgcaatgag agtcaaacct tcaacactgt 44940 actgggcctt gttttattat tctttgcacc atctgtgggt ttttatacta cgtgtgagac 45000 ttgcaatcca ctctgaagca tttgtccttg tataaaccca tgatcttact gttagctaag 45060 tgtgcagaca agagtcacaa ttttgtttgt gttatggacc ctgttatgaa actctcttta 45120 ccacctgaga gcaatgtgca atatgcttga gtgttgtagg gctctgtgac ttttgtacaa 45180 ttaagaaacc cagaacacta cctgttgccc taagcctagt tactaaagca aacatctatt 45240 ttattggctg ggtccaggta tgagagtcat tactggacct gtgagctggg tccagaaatg 45300 agtcaccata ccacctgtgg ccagatttac atatagcagt cacaattcca actgtggact 45360 tcatctgcat gtgagattca ggagctcacc actgggctcc cttcatgcat gagggtgaca 45420 attctaaatt tcagcaggat gtggctacaa 
gagtcacaac atcacctgtg ctctggtttg 45480 tattatgaaa ctctctgtac cacccaaagg cttcataata tacgtgtgtg tgtcagaatc 45540 tcctgtaacg ttttacaagt aggagactca ggaccttacc ctttgcccta aatctatcta 45600 taagagtcaa aatatcttct atcagatgga tccacatctg aaaggcatca tgattcctgt 45660 gtgctgggcc tagtaaaggt cacaatctta gctgtgggtg aacagaaggc aagagagtca 45720 catcacctgg gtgacgggca agagatatat tacaatccct tctgttgaca ggtcccaaac 45780 aaggagttac atcacctgga tgcttggctc tgtgatatgt cacattgccc agtgcaggca 45840 gggcacaggc aggagagtca cattacctgg gtgcttggcc cagtgctatg tcacattccc 45900 ttacttaggg agggcacagg tagctaaagg gagtcatatt acctaggtac tgggctagag 45960 ctatcagtct gttggtcatt cagggtgagg gttatgatct ctttagacat ccatgtaaca 46020 gagccccaac tcgttcatcc tcagaagcaa aggccagtgg agaccaaata aggcttgaaa 46080 gttaagtcct ctaaaaaggc agaatgtcct aggctttggc ctctcctctt taagcacaac 46140 cagatcataa cctctctttt gacttctcac tatgccgctc actggaacaa ggctcctgac 46200 tccacaccct cgtcttttca cccctgtaca tttattgggc agaccaaaga cccggagacc 46260 tgaactctgg gacagtctgg ggagaagcga atccaaaaag ggttggggca gcaacggctg 46320 agctcccaca gatgccccgg cccaaaactg caggcccagg ctgtggtcca tgtggcactc 46380 agtgcaggtt ctgcaatgcc actggaaggc ctagaaaatc cacttcagct gctgcttctt 46440 ccttttctta agcgctgaaa ggtatacact ttcccaagct tttcagatac ggccaggcac 46500 ccccaactga aaagggggat aatcgcattc ttcatcatgg tttatcacct cacccaagca 46560 ccagccagcc ccagcctcgc caatccagct gccccaggct ggaggagatc ctgtcaagcc 46620 tcctctctga aactccactc ctcctctttt ttttcttcct caaccactgg catacttttg 46680 gtcatcctac ttttcatccc aatggggcac ctgataccat aaagacctgg gaaccccact 46740 gatcacaggt cctacagggg tcaggcaaac atacaggaag tccagaagga cagtcatctg 46800 ccacctccag tatgagaaga acaaaacccc cgcactccat gaagaagtca ggaacccagt 46860 gaaggctgca ggtccagccc acaacaacct atgtatgtgg ggtgctccat ggctcatacg 46920 tgtgtcctgc aggctggagc ccaagcgagg gttcgcagtc tttgttccag atggggaact 46980 actacttcag cttattaccc aggctccctg atggggaaag gaaggaaatg gcgagaatct 47040 tcagcacagc aaatcccatc gtcctggcag agacaactcc tgcctgcatg ccacaccagc 47100 agccgcaggc 
tagaggcagt cacatcaggc ctccctctta caacattttg ttttgttttg 47160 tttttcgaaa ggccctaatc acctactttc ggtcatcctc cttgtcacaa tagggcacct 47220 gacgccatag agactcgaga aatgtacaga tctggggtcc tgaagcgtcc aggcgaaggt 47280 gctgggagtc agggaggaaa gtcctttgaa gcctccagaa gcaatgtaac aagacccagc 47340 actgcaggaa gaagtggggc acccagtggc aaatgcaggc ctacgtgcaa ggtgctccct 47400 gtccctcgcc tgctccctgg aacctggagc cgcacctagg atggggggtc tccattccag 47460 acccagaata agctcttggc tacctcaccc agcacctggt catgccagac agaatcttgg 47520 ctggggcttt ctcatggtct gcagtgcctt ccactttggg ccaccgtaat aacacttctc 47580 ccacagggac tctgcagtgt ccagctcacc agcctggagt ttctcatcac ccagtgattc 47640 caaaagaaac aatctacaat ggcatgagcc caagtgccag aaacaaacaa acaaaatatc 47700 tgaggtaagt ccattggtta cttcgatttt tcaaaggtaa catttgtccc tccttgaaat 47760 cctaagagtg catgaacagg ctattctaat ggacgtgaaa ttcacattaa aactgattga 47820 cagatgaatt ctgattccaa ggtactgttg tattctcaaa ttgcatctgc ttaccctgcc 47880 cccctcaaaa tggaagagtg atgactattt gtcatcttag cactgtgggg atgcagagcc 47940 ttagatggaa gtgtgttaaa aacaacattc cataaagggc tgtcacttcc aattttcaag 48000 caaggtggga aaccaaccgt gatgtatgaa acactgctgg ccaatgtgca caattagtga 48060 tacaaaattg aataatataa actattgtaa atatgtgtgg acattgtggc aatttggaca 48120 ccaaataaca ctgccaaatt caggaagaga aagctgtaac ctcatggttg ttatgaagct 48180 taacttctgg gtactagaac ctatcaaaat ttctccatgc caatctgtcc aaacaaacat 48240 tgagatgttt atttctataa aattcctata gaatcttcct gggcatgacc cttctagccc 48300 catcctcaca gcactggtgc cagaggaact gcacttgctc ctgcctcccc ctatattttt 48360 aggagtggaa atatttagtg acaggatgga ttgggagggt actgaggcct cctgggtggg 48420 tgggtcagat ctactgcagc ccaagccttt ttataataaa taacttatag gtaaattaga 48480 aaagaataaa aaatgcaact cattctcttt ctttgctcac cacaaagcag taaacacagc 48540 atctagagat gctgttggag ccaggcttgt cctgggaaag taagaagtgc ttatcaggag 48600 ccctgggatt gagggcaggt gagatgggtt cccaggaaga tacagccagg gtcaggtctg 48660 gcccactcac actggaaggg gccttctgaa ggccaggaaa aatggtccag gtcactgaaa 48720 ctcaagcggc ccatctgagc caaagatctg tccaggtcgc agtgaatctc atcagcactg 
48780 ccaatagggg gtctgatcag ccctgaagag ctccatcaat ggaggcagat ggctggtgga 48840 tgggtcagga gagctgccta ctgccaatgt gggagtccac tcggtgtggg tttactgcgc 48900 ctctcagtgg cctggataac ctgctgaggc tgcagcttct tcccgttttt gagcaaacgg 48960 ggacatgtat cattcccgaa ggttttcaga taatccctgg tgacccctgg caggggatgg 49020 ttatcttggc gatccccagc caggcttatt gacttgcccc caggcaacac ccagcaatcc 49080 cgcgcccacc caggcactta agcatttgtg gtaagcctct ttctggaagc tcgccttctg 49140 cttgcccttt gtcttcccca ccccagtact tctggccatt ctcattgtca tcacaatgag 49200 acaactggct cctggagact cagaaactgc catgcagacc ttgagttttt cctaggcccg 49260 gccaacaggg tggagaatct tccatctcct aaaggagcag aacagaccag gcatgtagga 49320 gttgtgcccc tgtgcaggct gcaggagcag ctcacaacaa ggcctgaaca tggggtgctc 49380 ccttgtttgt ccttttttcc tggaggcttg ggcttgcctg gccaaggttg cccttcagga 49440 cagggaatca gagctttttg tggctgcctt acccagccag ggcagcacca gagagaatcg 49500 tggctgaggc tttcctgtgg gctgcagtgc ctgccccttt agggtttcca caaaaacacc 49560 ttccctgcaa gtaatccact gtgtcctgat taccagacga ggcttcctca tcatttggtg 49620 attccaaaga aaataatcta cagcggaatg accaagttct aaaacaggag cacacaaata 49680 atctgtggaa agtctgtttg tcatctagat gtttcgaata tacacatttt cccttcttgt 49740 tatattcagt cgtgcatgaa actgatatta taatagatgt gcaatacact aaagctgatt 49800 aaagggtgaa ttcttattct aaggtacttt gtggtctcaa atttgtctgt tcccaccccc 49860 aggactccct taaaacagaa tagttatcac tgagagcaga ggtagaaaga aactagctag 49920 gcagatagag caaagagtac tcagcgtaac atcccttcta atgaaaagca gcccaaaaaa 49980 tcacatctct ttaacaaaga gcaacctgta agttcgggct gcaatcatag ataagtaaga 50040 tggaagcttg tatgggcagg gatggctgca gcttcatgga tagaaatgtc cagcttgggc 50100 tagatacatc caacatgggg gctccactcc tctttgtagc acacgcacca taggaaagag 50160 ataagcaact tggagtagct caaaagtcac ggagcctcag tgtcccttct gtggagccca 50220 gaacctgatg caggtctaag tcctgttgta tgaacatgtc ctgaccctgg cggccctggt 50280 ggtggtgcag cataggaagt ataagggatg aggtctagtc atgggccatg gagcctttct 50340 cattaatctt ggctgtctgc cttctaggga atataatcaa cactaataaa ggaggaaggt 50400 gagcagctgg cgctgtcgct ttgagggagg atggcgatgt 
gaaagtcagt gaccaccgtg 50460 gggaggacac tccctggctc catcctctgc atcttagatt tattgggaca gtttgataca 50520 cagagaagga ggagacccat cccaatggag ggtttgatta gatgaatata atcaatgata 50580 aattcctaga ggagggactt tttataatca actctgagaa caggttggag ctacatggga 50640 ttggagggga gggtggagcc ccttaaaaga aaagccccag agactgcccc tgccctctct 50700 ctcccccaca agttccattt attatcttcc acccaggagc tgtcagaatc ctgcccttcc 50760 gtctccagat caaagtcctt caggaaatgc aactacttca gtgacaagag ataattatca 50820 tcttctgaca gaggaggaat ttggggtttg gtcccagtcc atgaagtggc acagtcagaa 50880 taaaaggtga gagcttagga gattagcgga gggtagaaga acactctgtc ttgtgaccag 50940 cttcagagag cctggggcca tggcttcctg gtcaacatta ggccctgctg catggtgacc 51000 cctgggcagg cagtgggaag cctgaggtgt ggctcctggt ggcctcacaa ctgccactct 51060 ttcctgaagc tcctatttgt tctgtcagct aagcccccat cccagtaggc cagcaacaca 51120 ctcaagacca agaacaggcc atggtgaatc tcagggccac tgagtgcctg ggctggcagg 51180 ggcagagttc ctcagggctc agtgacattt ggactgagca tgggctttgg agtcatacag 51240 ctacactgag ctcccagctt caccgtgatc agccctgtgt ctgggacagg ggcctcactg 51300 ttctggaact tgagacgcca tagtcataaa tttaacacac acttctaact gctttttctt 51360 ttttatctgt ctttctctat aatcaccatg tactactggt ctcttgtgtt tatttaaatt 51420 aataaacatg ttacacagtg tgtattattc ttcctcatga gttctttact atattctatg 51480 attccaccca tatgaggtac ttatgtaatt tcattcatag aaactcaaag tagaaaagca 51540 gttagttctt agaggacaaa aggaaggtaa agggggattg ttgtttaaca ggaacagagt 51600 ttgagttttg caaaatgaat aaaattccct gtgaatgtgg atgatggttg cagaacaatg 51660 tgtgattaat tcctctgacc tgcactttta aaaattgtta aaatggttaa ttttatgtat 51720 attttaccac aatgttaaaa aggacttttt aaaatgaacg gactatagat atctgcaaca 51780 gcataaataa atatcacaaa tataatctta catttaaaaa ttgatgtaaa agtatccata 51840 ctctgtaatt tcttgtattt aaatccaaaa atcaaaactg aggttctggc ttccactaat 51900 gatgaagtag ctagtttaac taacaatctc acagagaaaa atgatgaatc ccaggtaaaa 51960 cattatatgt tattatagaa acacttctat atataataga tatatgaaat atgtgtgtat 52020 aaaaactgaa tgcatatttc agctgtgccc tccatagaag agagaagtat tgaagttaga 52080 agccagccca attaacaccc 
tctttaaaga caacactctt caaacggaca aaacagaatc 52140 cagagtctct ttaactcttc tatacagtct ctagtgcaca attttccaat tcaggagatg 52200 cgtgaaaaca catgaaaata taatacatac acaagataaa aagcaggcag tagacatctc 52260 caagatgtcc aagacataat cagcagacaa gaatttgaag gcagctatta taagcatgct 52320 catgggggca aaggaaaata ttctcataaa tgaacagatg tggaacatca gcagagaaat 52380 gaaaaatgac cacatagaaa aataataaaa ataattgtga gcttttctat atatcagaaa 52440 cagaagacat agcaatataa tttaaccaat ctgaggatag aagtaaaaat agtttaaaag 52500 aaaatgaaca gagccttaga cctatctgtg ggatgattga ttctgagaag aagagagaga 52560 cagaaaaaat taaatggggt agaaaagcaa atcaacaaaa tttaaagaaa ataaaccgag 52620 gcttagagac ccatggaatc attcagtctg agaaggagag gagagagtca gaagaattaa 52680 aagggttaca aaaataaatc aacaactaat atctgaaaac tttcaaaaat tgttcaaaaa 52740 cctaattctt ttttaatcta aagatccaca aaccccccac aaaaatacaa ataaaaccat 52800 accaaggcca tattgtgatt taagaaacta gcagacagga ctttcaattg acttgatatg 52860 atttattatt tttactactt ataagaatgg aaataagttc tccttagttt ttttcttgga 52920 gaaagtctga catgtgaggc acagatgagt tattaaaggc agatgacttt ccagccttgt 52980 cttaaatgtt ccattcttta ccttagaaat tatttaaatt tgtgtcttcc aaatactgta 53040 gtaatattga tgctccaaag agatgtccca cggagattct gctcttgtgt gtccaccctg 53100 cagggagctg aggcagtttc ttatgacagt ttcagaagcg agtagtcgtg cagtacttaa 53160 tctaaaaaac ttaatggaaa catgaattaa gagaatgatc actgtttagt tctatcagca 53220 aactattaaa agtgatccaa aggaggtatt tataaagaga tattaaaaga tttttcaagg 53280 gagccttatt cagggcagaa acgcagacac tatcgctgac ctcaccacag aaaataccct 53340 catgggttgg gagggaccaa gggacgctct ggtcctgctg acctgcatta atcacagcca 53400 ggaggtccac actagtacca tgaggcctgg gaagcagcct gcgtggggtc agagaagtgg 53460 tggatgtggc tcccaaagtg gcttacgggg tcccttccct gtggctgttt ccttactgga 53520 tgcagcaggg tcaggccctt cccctgtgac gttttctcct ctttatcaca gtggcgggag 53580 cgtccccgtg agaggcccga cccaggtgtg ggccacgctg cgagcccgag gaccaagcgg 53640 cactcctggg atgcagagga ggatttgtga cagcttaggg aacagaaaaa atggtttcga 53700 aaaggctaat ggcaggtgac taaggacacg atgttttcat tactggcagt gaactgacgg 53760 
tttcatacac taacaagggg gcttctcgag gggatcccaa ggagcccaag aactgccagg 53820 tcgcccacca ttaccctacg cctaagggac aggctgcact gagcatgtct gaaacggtag 53880 gcccgttagc cccaccccta ggaacgggtg cactccgcat gtgtaaaagg gcaggacctt 53940 taccccaccg ctagggacgg gctgcactac gcatgtctga aagggcgtga caagagggag 54000 gagcaagaag gggcggggtg gagggggagg ggggcaagaa aaggggcggg gtgcgcccaa 54060 catccggcgg agaagtatta ccatggcaac cctcccgcgc aggccataag aggcaaatga 54120 acctttgttg gttggcggga aatcgagacc ctggcaaggg ggcttctccc ttggagaagc 54180 ctgagaactg ccaggtccgc ggccgttaac ccgcctctag ggacggccgc actgcgcatg 54240 tctgaaaggg aactagaaag ggaggagcga gagagggtgg ggctgaggag gaggctgtgt 54300 gagaaaaggg gcggggcgcg cccaaagtct ggcggaagag cgttaccctg gcaaccctcc 54360 cgcggaggcc gagagaggcc accggccctt tgttgatctg cgagaaatca aactacgaac 54420 acgacaagca ttagcctgca gctcgaggag acaaggtgtc acaattacaa ggggaaacta 54480 gccgccctag ctccactgtc tccccagcac gagagatttg agaacagaaa ggcttccctc 54540 cgcagggcga aactgctggg ctgcctggaa gggcgaggca gggagcggaa ccgtcttcag 54600 gaaatttcgg gagttccggg gtcaggtcca ctccccggct gttgttgtgt tgttggcagg 54660 gcagagggtc taggatgcca gcctgctccg ggctgcgctg tgcgcctatc ccagggcggg 54720 gggatgcggg gcgacacccg cctcccggtg catccaggag ttgtagtctc ttcaccggtt 54780 ccccactgtg ggtggtgggg ctgcaggagg acaattcaaa ttgagatagg agcggaggcg 54840 gagcgcggcc gtgcagggag gggcagggcg gtgtaggcgg cttcatttac caagctttgc 54900 tggccattgt ttccatgcca aacccttgcc aaggggattg tcaggagagg aacttgaagg 54960 ggaggcgtgg gctggccagt gaggagggtg tgtttttgcg aagtgcgccc cgtctttgcc 55020 gaaattagga gtgtctggtc ctcactcacg cggctctctg ctgctcaggt cgattttctc 55080 tcccaccctc acccaggctc tttccacagc atcacccctg ccccagcccc tggagccacc 55140 tgctttccta agtggttttg gaaactgggc tgaggtccca aagggtgtcc actgtgctgt 55200 tgctctccgc tctgtccaag caaagcacaa gctcagccga ctttgaaaga cacccaccgc 55260 ctggcctggg aatgcacaag ttcagagctt tgcaagaagt gatcatgggc tatggctttg 55320 tgaaaatgtc accctcacca gtgccttttt cgcggacgtg gacgtggagg aatgagggag 55380 ggtaatcact gggctaccaa ggtactgcta agagcagaag agaaaatccc 
agttttcagc 55440 catgtgtctg gtttgacatt tcaccaaccc atttaagtgt gcaggccccc aaatatctac 55500 ctaaagatta tgatagttta ggcattttac acttgaaatt attgacctca tatccactga 55560 agcctgactg gccagtgtct caaagacaca gatgatgacc tgatccctca ggaacagctg 55620 gtgctccagc tttgtggagg tgaatttcaa ggtatggatt acttgggggg tcgttgaaac 55680 ctgtcaggct caggaacaga tggtgctcca gctttgtgga gatgaatttc aaggtatgga 55740 gcacttgggg ggtctttgaa acctgtcagg tttcatatct ctgctttgtg tgaaaagatc 55800 atcacctaca gtagtcaggg atgtgccttt acttactggt gctcatctgt tgaaacttta 55860 tatctagaca atggaaacat tgaggagcat ttctgctttc atgtagcctc ttaataattg 55920 acgccctaaa gccctgtgtc ctcagggaga gttcctttga ttcctgggtg gtaccaggtt 55980 tcatgctgtt aaatctgtaa aaacctgcgc gttaatctcc atgaatataa gaccttgttc 56040 tttcttaaat gccccaattt tttcttctct tgtcttatat tctgagcagg atttcaaata 56100 ctgtatgtaa ggaagtagtg agagtgggca tctttatctt aaaataaatc ttagaaaaga 56160 tttcaacatt tcaccactga caatgttagc tatgggcttg tcctatagct aataaagaac 56220 gcatctcttt attttgaggt atattctttc tatacctaat ttgctataaa ttttgttagg 56280 aatggatttt aaattttgtc aaaataattt taggcatgca taaaaagtca tgatttttaa 56340 tctttttgtt gtgtaaataa ggtgtatagc atttattgat ttccacatat taaaatatta 56400 ttgcatccca ggaataaatc caacttgatc ataacatcca actgtgttgc ccaggctagt 56460 ctcaaactcc tggattcaag agcccctctc ctctcagcct acgaaagtgc tgggattaaa 56520 ggtgagatcc actatgcctg gccatatata tatataattt gtaaataaaa ataaccttat 56580 atacaaggat aaattcaaat gtccacggtg aggtctgggc ttcagcataa ggaggaagtc 56640 ttgcctgaaa agggctgcag cttggaactt tttaccctgt cgtcatgtgg ctatgagttg 56700 gttcacatct tctgtcattc aggacccgaa ggggtgggac ctggggccct attatcactg 56760 gtgctggggt aaaaactgtc ttaaaactat cattttaatg cttagcaatg ttaattttta 56820 gtgagaaaca agattactta atttaatata accagatttt aagttactaa aaaaaaaccc 56880 taaaattatg acacaggtat ttcctctaat gtttttttgg ggggtttcaa gtacctatgt 56940 catatactga aacctactgt tctgtaagcc ctacccttaa aacaatcttg ttgttattgt 57000 gaacagttac ttagtgtaaa tcctaccctt aggcaaattt atatggtgat ttcaattgtg 57060 cttcacattc ctttcctgtg ttaagtgtct 
gggtttaggg gttaacagtg ggaggatcca 57120 ctcatcttca gccatctgag acatagcttc tattcataag tccatcttaa atgttccttt 57180 ctgagaaact tgatttgtca gcctcattct tcaacctttc aactcccttg gcttttaaag 57240 gcaggtttac atatacctac tcacaacaaa acaccctcat atatatgggc tgtcaatttc 57300 tataacattt ttatgtggtt caagactgta atgtgtagca catgtagttt tgtatatgga 57360 tagtatattt tatatagtat attttatata gctactttat attacacatc actaaaatac 57420 atgttcagta agtgctcact taacgtcatt gataggtcct tataaactga ctttaagtaa 57480 aacaaaatac tgtgtgccat ggaaaattaa cttgtgtata tcaattagcc aatgggaaaa 57540 ttgggtttat tatatagtat attgtgttac ttaaagtcac agtttcccag aatctatcaa 57600 aaaagtgaga acatactgtc attagtatta tgtagtacat tacagaatta cactgttatg 57660 gtatgttatt gtagtcttag cagttggtag tataatgtgt ttcagtttcc cccaaggtca 57720 cagaattatc caggccaacc aataacacct cctgtgggaa ccaggagcat ctcaccctct 57780 tgatactaca aagccttccc cgacacctcc tgtttgttct ctctgctccc aggtgcaatg 57840 gctgtgtggg tctgtatacc ttacatagct ttctccttcc atgattatat gtaatgaata 57900 actgctgtca atctcatctg tccagtgatt ggtgccatgg ttttaactat tccagtagca 57960 ttagggtggt aatttctccc tcaccaatgg ggtaaagggg aggctaatca aacaattcac 58020 aacactaact ggattaatca cccatgacgg aggacatctg ctcaacttta actgcttttg 58080 gcctactggt ttcatgatac attaaaagtc atctctgtca gagccatcag tttgtggtgg 58140 gcttttgctt tggtctaaat aacaatttgt ggcctttatc atgatttgcc ttccctgccc 58200 acactaaagc acacattacc tagacataaa tattcaatgg actcttctcc cgcgtgacgt 58260 gtcatcatgt ttagcctgtg tttggcaggc agtttgcaag acacttgcgt gtcaaggcag 58320 tgaaaacaca gcacnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 58380 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnggcagg 58440 gaaaacacag caccttaatc agtcagcact tcaggcggga ttaccacaag aacggccata 58500 tacctgtggt cgcttcatct ggtctgctta ctattactaa atgcctgggt agtcatctga 58560 acgttgttta ttattgcaca ctctcaggac agacccaggg atggtcttct gtaaaattgc 58620 aaaaacaaaa gggactttta ctttggagag aattatgctt gtcattcagt tgctctaaga 58680 agtgctactg tcatgcaagc aagaacagtg gccttgtctt ttctgagctg ctgcttcctt 58740 tgctgctgtc 
acgtgcaatc tcatgatgag gtgtgcattc cttgtgcctt atggcttagg 58800 cactgatgag tcataaaaag gagaaagaga atttaatatt agtccctccc aaaacatttg 58860 gggtaattct ctcactgtaa aacccctact catcatgtga acttttagca ctgctgctta 58920 ccagtatgcc aatggtgcaa tcttaggatc agaagtttga tggactcaaa aaaaaaaaaa 58980 agtaccataa cgaatatggg tgctaatccc agtgagatgg agaggcagta aacaccttct 59040 tccaaagaga agacagattg caaataaaca aataatgata taactcaaga aactagaaga 59100 gggagaactt atcccaaagt tagcagatga tgaaattaat taaaaaaaca gagcagaaat 59160 aaatacaatg aacactagaa aaaaaaaaca gaaaatgtag aaaaaatgta agaacttgtt 59220 ttttttttga aaacataaaa tcttcctatc tttagctagg ctaagaaata aatgctcaaa 59280 taaataaaat tagaattaaa tggaagatat tacaactggc aaaacagaca tacaaaaggt 59340 cataagattg ctgtgtacat agactcctat ttacaatgat atccaaacaa attgtgtaag 59400 ggagaagaca tgaataaatt tctagacatg taccacctac caagactgag ccataaagaa 59460 atagaaaaca tgaatagatc agtaatgagt aaagagttta gatctgatga ctttactcct 59520 gaattttacc aaacatttaa aaagaaccaa atctttgaaa aaaaattgaa gaaacaggaa 59580 tacttccaaa ctcattttac gaagccagca ttaccctgat acaaaaacca gagatagaca 59640 ttacaaaaaa aaaaattaca cgccaatatc cctgatgaac atagatgcaa aatcatccac 59700 aacatagttg caaatgaaat tcaaaagcac tttaaaagga taatttattt gatgtgggtg 59760 ccccccatcc caccttttct taaaaaatga aaaaaaaata gcatggcatg gtcatacatg 59820 tctgtggtcc tacctacttg ggaggcagag gtgggaggga cacttgatcc tgggaggtag 59880 aggttgctgt gagctgagat caggccactg cactgcagcc tgggtgacag aacaagatcc 59940 tctctctctt tcccccactt tgtgtgtgtg tgttgcacac acaaagggga ggggtgaggg 60000 tgtggatgtg tgtgtattat tcaaaatgaa aacaatgaca attatttttg tgtacattta 60060 tattttttgg agaaagagta tcactctgtc acccaggctc aagtacagtg gtgtgatctt 60120 ggctcacggc aacctgtgcc ccctaggttt aagcgattct tctgcctcag cctcccaagt 60180 agctgggata caggtgccca ccatcttgcc aagctaattt ttgtattttt agtacagaca 60240 ttgtttcgcc atgttggcca ggctggtttt aaactcctga gctcaagtga tcacccactt 60300 cagcctccca aagtgctggg ataacaggcg tgagccaccg tgcctgacca atgacaatta 60360 tttttaaatt ttagatttta caatctttct ctcctcttga atttgaggca gcctgagctt 
60420 tgaaaactgg caatcttcta ttgcagcaat gtgcagtaga aataaaatgt gagccacgtg 60480 tgtcaaagtt ttctagtagc agcataaaga aaaagaaata tgagtgaaat tgattttaat 60540 aacttagcct aatatatcca aaatatttta acatataatt agaatgaaat cattacatat 60600 atatagatag atatagatat agatatatat gtaactaaat ctctaaaata ctttctgata 60660 tttttcccta gcatatctca gtacatactg tgtacaattt gggtgctcag tagcctgttg 60720 tggccagtgg ctgccacatt gcatgtgcat ctctgagggc tcttgacttt tctgaccttc 60780 gggaggtaaa gggcctgaat tttccttttc tgccagatag gagtgaatgc ctcttctctg 60840 ccaatatcac tcctgtttca agggtaagag aggtggtgca ctgagagaca ggagggacct 60900 acaggaaaca agtgtccaca gatagaccac tgctgtccgc tgctgcttgg tgtggactca 60960 tcactcttcc agaaatcaga cagaagtcca aaaaatgggg acccagaagg gggaaacctc 61020 atgtttttag atctgtccat agccttaatc tccagaattt agatgtcaag agactagatt 61080 aaaggcaaac cttttatctt gcaatttggc tttggcaaat taaaatagaa atagaaatgt 61140 gtcactataa aaatcaatta tatacaaagg aaggcaataa gagcagaaaa gaggagaaaa 61200 tacctacgag aaacaaatag aactattaac aaaaaggaaa tattccattc ttttcagaat 61260 taaaatgaat atatacatgg agtgaaatat acactggaaa tatgtaaatt ggctaaagag 61320 attttaaaaa aacaagattt attttctgtt gtctataaga aactcaattt acatctaagc 61380 acacagatag gctaaaagtg ccagtatgaa aaataccttc taggaaaact gcaatcaaat 61440 gacagcaacg tggatcataa ttatgcaaaa tacacattaa gtcagaactg aaacaacaga 61500 caaagaaaga tgttgtataa tgataaaact gtcaattcac tggggaagct gtgttaataa 61560 taaatatgtg cacacttcac atcagggttc ccaaatgtat aaagttaaca ttgacacaaa 61620 tgaagaagga aatggctatg caaaaatagt aagagacata attaccccac tacccactat 61680 cagtaatgaa taataaagcc agacagaaag ttaacttgag aacagagaat ttgaataaca 61740 ctgcaaactc taaacctaac agacatatag aaaacactag tcacagtgaa tagaagtgaa 61800 aaaacaaaag aaacaaacac tccacacagc aatatcagaa tatacaattt tttcaatagg 61860 tcatgaaaca ttctcctggg ttgatgacct actaggacac aaaacaagtt ttgctaaatt 61920 ttaaaatggt aaaatatggg ccagggatgg tggctcatgt ctatcattcc agcatgttgg 61980 gaggctgaat tgggaggatt gagtgagttt aggagttcat ggccagcctg ggcaacataa 62040 ggagaccttg tctttacaaa atataaaatt aaaaaattaa 
ctgggcatga ttacacgtgc 62100 ctgtgtgtcc agccactcag caggctgagg tgggaggatt gcttgagcct gggagatcaa 62160 ggctgtggtt agccataatt gagtcactgt gcttcagcct gagtaacata gcaaatctct 62220 gtcccaaaag agattaaaat attacaaact atcatttttg attaaaaggg aatacaatga 62280 gaaatcaata gcagaaaaaa tactggaaaa tctacaaata tgtggaaatt aaacaaccca 62340 ctcttcagca tgctctcgtt aagggtcgga agacaatatt gtgaagatgt tcacactacc 62400 caaaattatc tacagattca atgtgatcgc tgtcaaattt taactgtcat ttcatttgca 62460 gaaatacaaa aaaaattcta aagcttatat ggaatctaaa gtgagtatca agagccaaac 62520 aactttctaa aagcataata ttggagctat gacactcctt gacttcctaa tgtattacaa 62580 aactacagta accaaactat ttggtactga cataaaggca gacagacaga ccaatggaac 62640 agaatagatc acaggaataa actgtcatat atatggccaa atgaggagtt atttttatat 62700 ccatattcat tgcagcatta ttcacaacag ctgataggtg gaaggaaccc aaatgtccct 62760 cagtgaatga gtggataaag gcaatttgga atatacaaat aatggaatat tattcagttt 62820 tttaaaagca ggagatctga ttatttttac actaagaata aatcttgagg acattatgta 62880 aatgaaataa accagtcaca aaaggacaga cactgtttga ctccagttaa ataaaatatc 62940 taatgtagtt aaactcttag aaacagaaag tagaatagta tcagtcagag ccttaggggt 63000 ggaaataaaa gggtagttgt tgtttcatag gtattgaatt ttagttttac aacataaaat 63060 cattttagcg atatgttgca tagcaatgtg aatatattta atattattta actatgtact 63120 taatatattt aagatggtac attttatgtg ttttgagtac attaaaaatg aaaaactttc 63180 taaagagata catatttata acctttttca aaaattacct ccaaatcata aaaatgtcag 63240 aaaaacaata aagaggccag gtgcagtggc tcatccatgt aattgaaata caacaggagg 63300 ctgaggatgg agaatagctt gaggccaaaa gttggagacc agcctgggca acataataag 63360 acctcatcta caaatcacaa gcaaaaggaa actggcaaat taaaaaatat gtgagactta 63420 aaacagcaga ctcttgacgg ctcaaaaaat ttaatattat taagatgtca atactactca 63480 cagtgaaaca caaattcaaa gtattttcta tcaaaatccc aatgttacga tttttttaga 63540 aatattattt taagtcctaa aattcttacg gaatatcaag ggacaatgag tagccaaaga 63600 agctttagag aacgaagtta gaggtgtcac acttcctgat ttccaaacag attacaaagc 63660 tatagaaata caaccagaaa gacaaataga tgatggaaca gaatagagaa cccagatata 63720 gaccatcatg aatatcatca 
gataatcttc aaacaagttg ccattaccaa acaacaggga 63780 aataacagac cctttaacaa atagtgttct aaaagtgaac atcaaaatgg aagaaaatga 63840 aattggactt ctgacttgaa ccatatacaa aaatatctta aataaattaa acaaatgtaa 63900 gtaagatagc tataaaactc ttaaaatatg aggtaaaaat catgacattt gtcttggtaa 63960 ttttttaaaa tatgacatta aaagcagaag taacaagaaa aaaagcagaa aaatgggact 64020 acctcaaatg tagtaagctt tctggacata aaggaaacat ttaatgtcac gtaagaaatg 64080 agaaaaaaat tacaaatgat atatttgata gaagttaata accagaatgt ataaacaact 64140 ttaaaactca acaaaaaaaa ctgaacaacc ctatttaaaa atgggcaaaa ctctcaacag 64200 atgtttctac aaaagagata tacaaatggc caagaggaat ttgaaaggat ggtcaaattc 64260 acgaatcttt agagaaatga aaagcaaaat cccaatgaga tattacttca cactcattag 64320 gatggccact atcaaacgag agaaaataat aaatattttc aaggatgtag ataaattgat 64380 atacttgtgc actgagtggt ggaaaaataa taatgcagcc attatgaaaa atagtacaga 64440 ggttcctcag atattaaaaa tggaattatt gtactatctt gttggaggtc aaaagaatga 64500 gtgttgtgac caactcatta taccactgga ggctatatga gcaaacagca aactgttctc 64560 atgaatgcag gatgttggca agctgacaac tgcatctgca accagaagga atgctgaggg 64620 cagtcatgcc ccaggcacag tgtttcttgt ggttatctat aggaacatct ggagcctgtt 64680 gtacaaagaa accaattatg tgagcctgtg ataaatcagg cagctgacta accattacct 64740 gcttcctgcc ctgttgattc tacctaatga atacaaaggg ctgtataagc tcagggccct 64800 tgttccctat aagcaaggag ccccctgacc ccttctttaa aacagatctt tttgtctttg 64860 tcttcatttc tgcgtttgtc cttcttcttc agtcctgaac tgacagccac aagtggcacc 64920 tgaacaggga cttgaacaaa gaaggtctgc tggagcagaa aaagtgaaac tgaccagatg 64980 aatgagaaac cctgggatga gtctgcctgc agaggatata aggtcagtgt cctaaagagg 65040 tactgggagt gggaagtttc tgaatcaggg taacgtgggg gcagagtttg tctgttgagg 65100 agcagcatta cgtgcagttg cttaaagttt tacgtaaaca atctggtgct caggttagtt 65160 ctcaaacgct gactaagctg ctgcaggagg ttatcatgca taacccctcg tttccgcaga 65220 caggcgctct tgatgtggaa aattggaact gagtagggga aggattaaaa tgggctcatc 65280 aaaaaggtct taaagtttat ccttcctttt ttttctgttt ggagtttagt ccatactgtc 65340 ctcctgccat tatctcattc ttattctgcc aaaccgcagg agccatgttc tgaatctcaa 65400 
attttgaaag aatcttttgt cccacccaca acacccaaag aaaataataa acaggagagg 65460 gaggatgaaa attggcgtct accaccccct ccagtagcag aaacacctgt accatctcct 65520 tcagtaacag aaatagagac cccactgcaa agaattccgc ggactgctac catagctgga 65580 gagcccttag gacattgcac tttcactatt tctgtaaggc ctgatccaaa taatccacag 65640 cagtttattt atgaacatgc ctcactagag tttaagttgt tgaaggaatt aaaagctagt 65700 gtagtgaata atggagtaca gagcccattt actttaggat tgttagaatc tgtattttga 65760 actatgtgtc ttccatcctt tgatgtaaag catttggctc acacttgttt gtctgctagt 65820 gcatatctga aatggaattt aaattggcaa gaactgtgtg cacaccaggc tagcacagaa 65880 ttgtgctgcc gggcacaggg gacattacag aggatatgct gttgggtaat ggcccttatt 65940 cagacctgga atatcaaatg acactcccag acgctgctta taagcagcgt gcactggctg 66000 ctaaatgcac ctgggccaca attccagagg aaggggtccc aatacaatcc tttttacatg 66060 gcatgcaagg gtcaaaggag cactatgcac attttcttgc atgattacaa gaggcagtga 66120 ggcatcagat tcctcatacc actgctgcag aaatgctaac cttaacttta gcttttgaga 66180 atgcaaacac ggattataaa tgtgcactgg ctcctgtgag atgtactaaa aacttaggac 66240 attttctcaa aacttgtcaa gatgtgagaa ctgagcttca ttgctctaca atgttagctc 66300 aagcaatggc taatttagta gttaacaaat ctaaaaaggg ctaagggtca aaccctaaaa 66360 tgggaaaatc ttataattgt ggaaaaatcg gacatttcaa aaaggaatgc catcagacct 66420 tgggcaaaag ggatcttata atgcaatacc caaccttcag cagaaaaaat tccagaaatt 66480 tgcccttgtt gcaataaagg aaatcattgg actaatcaat gcagttcaaa atttcatcag 66540 aatggcaccc ctctgtcggg aaccaagaag ggagcctgga cccggtcccc tcaaacaatg 66600 agggcatttc ctgtccaggc cacaacccca tttcagggag gagtctatgg aggaacattg 66660 attccctttc cccaggaaca cccagaagca caggaaatag atctccctgt cagagaatgg 66720 gttacattag ttggaggaaa caaacccact aaaattccca ctggtatttg gagacctttg 66780 ccaacaggat atatgggatt aattttgggt aaaatccatc tcaacttaca ggacattact 66840 gtagtcccag gagttgttga ctgtgattat gaaggagaaa ttcaagtagt ggtaatatca 66900 caagatttgt tagtttttga acctggagaa tatgtaactc aactactgct tattccctgg 66960 gagttgtttc cttctccaca taaggagaaa tgagagaatc aaggatttgg gagtacagct 67020 aggaggaaaa tttatttatc acaacccata gcatctaata gacccacctg 
tacagtgcaa 67080 attaacggaa aaaaatttct atgggcttat ggatatggga gctgatgtgt cagtaatatc 67140 taaaaacaat tggcccccat cctggcccct gcaattaact cctacatcgc tagtgggaat 67200 aggaacagct caaagtgttc aacagagtgc tgaaatttta ccctgtctca aaccagatgg 67260 acagtcatgt acttttaaaa tttattttgc aaatgtaact gttaacctat ggggccaaga 67320 tttacttaca gcatgggata taagacttgc aaatgaaact attgacaatc cagggttcaa 67380 aatgttaaag aaaatgggat gtcaggcaga aaaggcttag aaaagtccct acagggaaac 67440 actgatccta tatcaatagc tgggcaaaca gatagaaaag ggctaggtca tcagaatttc 67500 tgatgggagt cactgatatt tctcccccat ctactgtttt accgctggag tggctgacta 67560 aaaaacctgt atgggtggat cagtggcccc tatcacagga gaaactaaca caattccatc 67620 agctagtaaa agagcaaatg gatgcaggac atattgaaga gtcagttagc acctggaatt 67680 catcagtatt tgtaattcct aaaaagtcag gaaaatgatg actgctacat gatttgagag 67740 ctattaatgc acacattaaa ccaatgggtg cattacagca aggtctgcca tccccggcag 67800 cctttccagg aggctggcct ctcaaagtaa tatatcttaa agatttttta tttatttttt 67860 tattttactg ttacatgagc aggataagca tcgatttgcc ttttatgtgc tttctgttaa 67920 tcaaaaagag cctgtctctc attatcaatg gaaagtctta ccccaaggca tgcttaacag 67980 cattatatca gcatgttgta ggataggcat taaaggtgcc tctgaatatg tttcccacag 68040 cctacatccg tcattatatg gatgatattc tttctgcccc tcctacagat caaattttat 68100 atcagttatt cagataaata aaatgagctt tgacttaaat ggaatctcaa aatagctcca 68160 gaaaaggtgc aaacaacctc ctgataccag tacttaggca ctattgttac tgaaagaagt 68220 gtttggcctc agaaagtagt cctccatagg gacagattac aaactttgaa tgatttccaa 68280 aaattattag gggacattaa ctggctgtgc ccaatgctag gtattcctgc ttatcaactc 68340 aaacaccttt atcagaccct tcaaggagat tctccattag actctcctca gcaacttact 68400 aaggaggcaa aagctaagtt acaacttgta gagctgatgt tttggcaacg acatgcctcc 68460 tggctacagc cacaaaaggc tttgcttctg tttattcttc ctacccccca ttcagcaaca 68520 agacttttag gccaattcat agtcaaatct gtagtagtat tagaatggct ttttttaaat 68580 ccaatcagac agtgaaatct ttgcaagttt atctttcttt aattactcaa cttataacaa 68640 taggtaggca tagatcaaaa atgcttatgg gatatgatcc agacaaaatt attgttccct 68700 tggattccca acaacacact gcagcatggg 
aaatgttgac tgcatggcaa attgctcttg 68760 cagatttcat aggaataata gataaccatt atccatcaga caaaattttg caattttata 68820 aagttcaccc ttttattctc cctgtaatca ctcatcacaa gcctattcca ggtgaacaga 68880 cctattttac tgatggttct gccaaaggac acgcagctat ttatggacct aacatactta 68940 gacaataaag acctctggag cttcagctca atgctcagaa ttaatgatag ttattcaggt 69000 tttacagctc accacttcat ctcctaataa cattgtttgt gattcagcct atgttgtaaa 69060 tgtagccagt cgtgctgaaa ctgccactat taagagcacc ctagaaccag agctgcttaa 69120 cttgtttcta agacttcaac aagctgttcg ctctcatgct actccttttc atatttctca 69180 tattcactct cacacgcaac ttcctggacc actatctcta ggtaatgata aagcagataa 69240 actaatcggt tctgtatttc aacaagccca agcttctcat gcattactgc atcaaaacac 69300 ctctgccctt actcgtatgt ttcatctgcc tcatggacag gctgcagcta ttgtgcaaac 69360 ctgccccact tgccagcatg ttcctggtgt tgcacttgtg gaaggatgta acccacgagg 69420 cttggcacca aatgaaatct ggcagatgga tgttacacat atagcagcct ttgggaaact 69480 cagctgtgtt cgtgtgacta tagacactcc catatgctac atgtcacatg ccaaacagga 69540 aacagctggc catgtccaac aacattgttt gtcatcattc gcccatatgg gggtccctaa 69600 acaattaaaa actgacaatg gacctgcttg tgttagtcat gcttttcaaa attttttaca 69660 gttgtgggca atcactcata acacaggaat ttcttacaat tctcgaggac aaggcattat 69720 agagtgggca catcaaacac tacagtgtat gttgtaaaaa caaaaagggg gaataggaga 69780 caagctacca cctcaaacaa aattacattt atccttattt acttttaatt ttttactttt 69840 gatatggata gtaagactct ggccaaacaa cattggcaaa tgttagaggg aaagaggaaa 69900 gtttacccaa aggtactatg gaaatcccca gaagaaggac aatggaaagg cctggtggat 69960 ttactgacgt ggggatgagg gtatgcttgt gtttttacag gagatggata aaccgtgtga 70020 gtgccctcaa gttgtgtgcg accatggaat gggagactgg agggatacat ggatcccaac 70080 tacaggccca gctcctccag tatgagccat gagccagttg aatctgaatg tgaagatgga 70140 atgaagaccg acgagagtca cactgacgtc aaccctcata acatggggtc agatcaagaa 70200 aaccacacca gaagctgaga aactggtgta gtgccagggt caggcaaaaa cccctgactc 70260 catgtttatg gccatgctag ctgtaatatc ctgtgcagta tgatttttct gtgcagaagc 70320 aaaaacatat tgggcatatt ttcctaaccc accggtagtg tgatcatact ctgaagcagc 70380 actcctcctg 
agatatatca tgatcaagga gcatcagtac caggacctct aactccccct 70440 gacacagagc aattagactc tcataacaat ggtatcaatt ataccactcc attggaggga 70500 cttcctttat gtgtcaccca ggatacattg ctcaactgca gttgccttgc agtttgatcc 70560 caagcatggt tgagttacca taaaaaaatt atgtacctat tagaccttag ctttattaat 70620 attacttgtg tagttactaa tcactcctgg ccccatcacc caaattgtac tgattataca 70680 gaatgggctc cctttgataa ttctcacccc cctccttggg cccactgtct tggcccctta 70740 gctagacaat agtccatgtt aatgggagac attattgact ggggtccctg tggtcattaa 70800 gatgggagag atgagaatca gaccacatgg cataaacttc actggcactg gtggcgaaac 70860 tttaacatct cttcacttca acacactggg attcaatccc aatctgccat gcaacttgct 70920 tggcatggaa cgggctttag cccacctttg cctcaatggc attatcaagg aaagagaggt 70980 ccaattcagg agtctatgtg gaaggcagca ctcccatata tgaatggcag catttgggtt 71040 gggacactat ccaataatag taatagtgct caatacagtt taatgttacc tttgtaaaaa 71100 atgtttgaaa tttgtgtttt taatccctat gtttttctag cagcaaaaaa ggaccaactc 71160 caggtaaaca atgcccaatt gaattgtgat tcctgtcaac tctatcattg ccttaatcat 71220 agcacaatac aaacacacag catatccacc ctaataattc taggtcgcat tcctggatta 71280 tggattcctg taaatctatc tgagccttgg gcagccaccc ccactttaca ttttgtaaaa 71340 cttcttactc agcttactca tggcactcgt agagccttag gcatgataat ttttactata 71400 gtctccttaa ttacattaat accctctgtt gtggtgtcct cagtagcact ggacagctcc 71460 actcaaacag ctcaatatgc agaaaattgg atgcatacag ctgaccaggc atggatgttt 71520 caaaataaaa ctaacactga gatacaaaca gaagtggcaa tgttaaagac tactgttctg 71580 tggctagaag aacaagtaca aagcttgcag ttgcagtagc aattgcgttg tcattttaac 71640 catactcata tttgtgtaac caattaggaa tataatcaaa gtgaatatcc atggaacctt 71700 gtaaaggccc atttacaggg agctgttaca tccaatgtta cttttgatat taatgattta 71760 caaagtaaaa ttctaacagc acctcaatat ctttttcata attattggaa taatgttact 71820 atgtttctgt tttttgttca tagtctgtaa aatcaactgg aacaccaacc agcaattgag 71880 agctgaacag cctgcaatta cctttattca attaaatcaa aagcagaaag ggggagatgt 71940 tggaggctga aagaatgagg gtcatgacca actcagtata ccactggagg ctatgtgagc 72000 aaacagcaaa ctgttctcat gaatacagga tattggcaag ctgacagctg catctgccac 
72060 cagaaggaat gctgaggaca gtcatgcatc aggcacagtg ttccttgtag ttatctatag 72120 gaacatctgg accctgttgt ataaagaaag caattatttg agcctgtgat aaatcaagca 72180 gctgactaaa actgttacct cttcctccct gttgattcta cctaatacat gtgaagggct 72240 gtataagctc agggcccttg ttccctagaa gcaaggagcc ccctgacccc ttctttacaa 72300 caaatctttt tgtttttgtc ttcatttctg cattcatcct ccttcgttca gtcccgaacc 72360 gacagccaca tgatctggca atcccatttc tggatatcta tataaatgtt caaagcagga 72420 cctgaaagaa acatttcaca cccctgttta taagagattt attctaaaaa tccaaaaggt 72480 agaagctact tgaatgtccc ttgacagata aataaaataa aataaaatat gatatataca 72540 tataatatga tttaaaaaga aaatcttggg ctgggtgtgg tggctcatgc ctgtaattct 72600 agcactttgg gaggccgagg tgggcagatc acgaggtcag gagattgaga ccatcctggc 72660 taacacggtg aaaccccatc tctactaaaa atacaaaaaa ttagccaggc atggtggcag 72720 gtgcctgtgg tcccagctac tcaggaggct gaggcaggag aatgatgtga acccaggagg 72780 tggagcttgc agtaaccgga gattgcacaa ctgcactcca gcctgggcga cagagtgaga 72840 ctgtctcaaa aaaaaataaa taaaataaat aaataaataa atcacacact gcaatgacag 72900 taaaccttta ggacataatg ttaagtgaaa tgtgccagga aacaaagtga cagtgagtgt 72960 atgattcctc ttatgatata tcttaagtag tccaactcac agaaacagaa agtagaatgt 73020 caaaggctca ggagagggta aaatggtcgg ttgacgttta tggctattga gttttagttt 73080 tgcaatggaa aagctctaga agcctgttgc ataacaatgt ggatatatgt aacactacta 73140 aattatgcaa ttacaaaggt atagactggt aaattttgtt gtgctttatt acaattaaaa 73200 tattgtaaag tgatacataa aagagataca gagttataaa cttttcagaa aattaccttc 73260 aaattataag cgtgtttttc tcacacaaag ataatataga ttcatcaaaa aatacatggg 73320 caaattaaga ctatttacat gactactctc ctgaacaagt taaaacaaac ttttgacatc 73380 agccaagaag agaaatatgc aagataagaa taaatggagt atatttatag aggcaaacaa 73440 acacatgatt ttattggtgg tagatatgac tgattcatat tttaattaaa ccccacatcg 73500 actcgatgtg tacatagagt tgcagattta catccaaaat cataatatgt aggtaaaacc 73560 aaattcacaa aacacaaatg tcaaagaagc taccccaaaa aagaagcaca gtaatatgaa 73620 atttcaaaac aagaatgaga gaaacattaa caacaacaac aacaacaaaa acacttctat 73680 ataaaacata gatgtacaat taagaaagaa tccgcttaga 
taattacaat ttcccgctgt 73740 gaccttgcac tggtggtgag cacagatttt gaatcatgac tatgttaggg agacgcccaa 73800 ggagacagac agtgccacct ccagagaagc cattgcttct cctcctgccg ctgctgctgc 73860 tgcccccacc gtccgctgcg cctgcagccc ccactgagcg tcggactcct tcctggagta 73920 gggaggtcct gttccttctg gagcgacaga caccctttct cctggccttc tcgcttacta 73980 gcccggcagg tgctggacag gagatctgag ctggtcctgc gtctctgagg agctaggagc 74040 ccggctggga gaacaaggag acgaactgtg gggagaaggg gcgacaggaa cgccaggctc 74100 atgggaccgc tggcagcggc ctgggtatgg ctggcggctg aatggtcaga gatacgagag 74160 gtggccactg tccccacctt tggcccccta gccggcattc gtacattctg tgctcaacaa 74220 acggaagcgg cagctggagc tgctgctccg ggaggtggag tggcctggca gagggcacat 74280 ggctgccacc tgctgcaagg tgagctggtc tgcagcctgg gcccacaaaa ggccgctctt 74340 gtgcaggaca caccgctgcc cttgaccctc ttgctccccc gcctgctgtg caaaatgctc 74400 aggtccctga tctcgggctt tcctggcaag tgcactgtgg tggggaggca gcagggagga 74460 gggcttttcc aggagccctg aacagaggat cttggcataa agaggagaga gaggtggctg 74520 actggttcca cttgtaggta ggggggcaac aaaccccatg ggaccctgtt ttttcaggga 74580 gatttcagtt cacttcttat cttttctcca cccacttgag cctctgagaa tagaggagac 74640 gaggctgttt taaattggcc taaccataat ggtctggacc cttgccccag ggcagaccta 74700 attttggggc tctttgcagc atggaggctc acgcctgtcc accccaggtg tcttcaatat 74760 agggtctagt taggcctggc tggcagtgat gctgagacgc agcacgacct ggccagatct 74820 tcgcctgtta caggacatta tagccttcag tgccctgtgc catttatctc gcctccagaa 74880 gcccctgtga gcctcagtgt tgccggtgcc caggccctgg ctgcctctct attagggtcc 74940 catcttatgc ctcctaaatg caccggggtc tcactctgcc tttctccctt tcccagaaca 75000 ggccccttca actccaacag aacatgcctg gaccatgtgc atccctcttc agtgttaaaa 75060 caaagaaaat ttattttttt tccactgaac atgtaactga tttaacatat aaggaggtca 75120 gctttatgca tagatctatg catgtaaata tatacaaaaa ttctaacgct gtgggaaaat 75180 taacatcctt acactttgtt cagttatttt atagttttct ctctctcact caattgcttt 75240 tttttttttt ttttttgaga cagagtcttg cactgttgcc caggctagag tacagtggca 75300 aaatctcagc acactgcagc ctttgcctcc tggattcagg agattctcat gcttcagcca 75360 cctgagtagc tggaattaca 
ggcatgggtc accatgccca gctgtttcat gtgttttttt 75420 ttagtcgaga ccgggttttg ccatattgcc caggctggtc tcgaactcct ggtctcaact 75480 gatctaccct ccttggcctt gcaaaatgct gggattactg gcatgagcca ccatgcccag 75540 cctacctgtc actatctcta tgtttatttg ttcaacagga aaattctcag tgaagactcc 75600 tcagtatgaa ggagataagc ctgcacaatc agtcactgat agatgcttag tggaaaaact 75660 tccaattccc atttacagct ctcagagcta ggattaaaaa ctcctggtca taaactcatg 75720 tgatgagaag ttatagcacg ccctcatttt ctacatatcc acttgcattt atggttggct 75780 tttgaacttg ctagaaggga aagaagtgca aatgtgtcct ccttagagct actctcctcc 75840 ccttggtggg tttccagttt gtgcattgtc cagatggccc aggagctgac gatcaaaggg 75900 aagaagtcat gtttgtcatg agaatgcttt gctgcatcag gattcagtga agctgttcac 75960 cgcctggagc ccatgcagcc tcaagaggca ggatggagct cagaaaccat cactgaggtt 76020 agaaagtgag caccaaagtt gagggaagcc cacaggagtg agccgaagtg ctccctttgg 76080 atttccaagt ggttgctgct gcttcttcca tcagccttgc ttctgaccac aatgtgttcc 76140 tggtgccttc ttcttggcat tttgctgttt gtgtccaagg aaaatagtcc tgcatggcag 76200 tggtgagaag gatggctgcc tgctgaagct gatttgctgg taagctttgc agcctgttaa 76260 gcagagcctg aaattctttc tcactgagtg gtgattcaaa ccttggaggg tcctcccctt 76320 gtggatcggc attcagaaaa gattgtgcct tttcctgaaa ctctggggat tataggagat 76380 catgcagatg gctggtgtgg ttgctccagc aggtggatgt ctccttcgtg gcttgtgtct 76440 gtttctgtca caggggagac tcagtgtgca tgggctgcta aggtgctcct gcttcaggtt 76500 gaatcatgac atcctaggac cccttgcccc ggctccacca ctgctgggat ttggcctgtt 76560 gacccaaact ctaaaattgt ggctaagaat gaccaggctg aggctgtcct ttaaggcaga 76620 acttccaggt tagtgtctca tttttcttat cctgaaattt tcctttcgac agaggccaga 76680 agcaagtctg tgtatgggag agcctccctc ctagagctgg taccattgac acatgactcc 76740 tgagtgccag aagaggttga gagaactctc ccatctgcac agcctgtctc atgcagaatg 76800 cagatggatc caaaaaatca caggatgtgg gaggtgaagg aagagcttgt aaaaatgaaa 76860 agtggctggt gaaagagtag gaggcatgaa gaggatgaga cctgcctggg gcagtgcaca 76920 tgttttgttc cagccaaaca atcagatgag gtcttggtct tggacctggt gccagggaat 76980 tcataagccc ccttttgctg tggcctggga gctgaggtct ttggttctga aaccaaatgt 77040 
aaattttgga ctctggaata cctgtctgtt taaccagttt ctctctactg gtgatcgcag 77100 gaaatgaatt atcctgtaaa attttgtgga ttcttctcaa aggcttcaat gagtacactg 77160 atttgtggtt cagtgatgga tgtccctttc ctcctgcctt cttatttgac ttacaccatc 77220 aatattaact atggcagtta tggtaatatc attctttaca acagaggaaa cttcaagtca 77280 ttcattgatg atatcaaagc ctcattctcc tacattaaac attctcctac atttagcttc 77340 ttgaatcttc tatgtccact gtctagaaaa cctaaacaca tgaagtacca caaactagaa 77400 ataaatacct caataggcac caatcataaa agtaaatcca agaaggaaca gaatatatga 77460 atacatatat aacaagtaaa gggattcaat taattaaaaa tcatcacaca caaaaaagcc 77520 cagagtcatg tggcttaact gataaattct actaaacatt tagtgaagaa ttaatgccaa 77580 ctcttcacaa ggccttccag aaaatagaag acagttattg ggaacacttc ccaatttgtt 77640 ctatcaggcc agtattaccc tgatactaaa gccagacaaa agcatcacaa gtaaatatga 77700 acatagatga atttccctga taaatacaca aacagaaaat ctcaaaaaag agtgaaatga 77760 atcaagaata aatcaaaatg actgtacacc atgaccaaat ggaattattt cacaaatgca 77820 atattgatct atccaataat caatcaatgc attacacaaa gtaataggat aaaggaaatt 77880 aacagaaagg tcctttcaac agacacaggg agcatttaac caatccaata ttcattcacg 77940 atctccctgg aaaagaggag tacaagaaac ttcctagatc tgctaaaagg cgtcaatgta 78000 aaacttacag ctaaccccat aataataaaa tactggttgt tgtgtcttga catttgagaa 78060 caagacaaaa atgttaacat ccaaataaat tacataagaa aaataaacaa aatcatcaat 78120 gtgggaaaat aagaggttag aactctctat cattgcagag gacataatgt gaatataaaa 78180 gtttataaga aattcattaa aactcactac aaccaataag tgagttcagc aacatcacaa 78240 gatacaaaac caatatacaa agttcaatcg tacttttatg tactaacaat gatcaacctg 78300 aaaataaaat taagaaaaca attccatttg tgtatgtatg aaaaggaaaa aaaatattta 78360 ggagtagatt taatcaattg caattttaca gtaaaaagaa aaactgttaa ataactttta 78420 aaaactgaaa aaaataggca tcaaggttta tttagttaat attattccat tattcagtgt 78480 atccatttaa actccataca ataaaataaa gtatatttta gcattatttt tagtgtaata 78540 aatacaggtt ggtagaaaac tcataaatat aaaaggaact aatactacat gtgcagagta 78600 gcccattcta gtttttctat tttacatcta ctgtaaatca aacagatttt cttccatatt 78660 ttatagtgga tcttaatata aaatccaaat tgttaaatat gtgagattga 
ttataagctt 78720 gccaaggagg tttttactca gtgtgggaat ttggagagca taaagctgca aagacagaag 78780 caaatgtttt tgaatgaatt atgagggaca atactcacag gagtgattcc cacttttatc 78840 agttgactca tgtgatctta ctttaggaga aactgactct cattttagac attggttcat 78900 cacagatgct aaatgaacca gcaccaagtt tatatccagg agaactgctc actgtagagg 78960 attttgtctc ctagctgatg acatgttctg tctttcatag gctactccaa aacttttata 79020 atgatggtag agttctgtaa agtggagcct caggcctgct atccccatgc tcctggccag 79080 cagcagcatc tcccctgagc atggtgacat agggcatgcc attgacaccc aatttgctgg 79140 tgttcacctc cacatactaa gttgctagag aatttgcagt ggatttccat tttgctgcat 79200 ctggctgtcc agaagccctg caagtagaat gggatggtca ggagaaaaca ttgaaagaat 79260 agatgggagt tcagaggctg cccaccctcc tgccctctgc ccacgggcca cagccctcac 79320 ccagctgtcc agtgtgtatg tctgctaaag gctgctgcac ttgttctcca tcacagaggt 79380 ggggtgacag cagtctgtgg gcaccacact ccatgatcag cctctactgt tggtgccaca 79440 gctccaggta gaagggcagg tgagccaaag acaggtccca ccttccacat ccagcccaca 79500 ttcccaccaa cttccaggcc cacctccata tgctgatgtg atgctctcct tagagctctt 79560 gtggttcttc agccaggaga tggagagagt ggggtttcaa gagtaaggca gtgaaagttg 79620 gtttgggtgg ccaccatggt tggtggtttc ctgtccatcc actcggtcca agtccagtaa 79680 ggggcctctg ctggtggaag cacacatgaa ggccatagct ggggtgagga gcagagactt 79740 atctcaccag acccccaatt caaggttgcc gctgcccacc cacaccctct cttttccctc 79800 ctgtgtaggg agattgttgg cctttcaaac ccctcttcct gggctgaata aggattccag 79860 gagcttcaac ataatgtccc cacccagtca tgctcagagc tgggccatgt gtcccctccc 79920 attcccttca cttcccacaa gtggctgctc ctgctgagag gttggggtgc ttcatcctgg 79980 cctaaaaacc tcaaagaata atggagtctc caaaggagcc cccacccacc cagggaggct 80040 gacagggagg gtccaccagg aagggagccc agcaggtagc ccagctaagt gaatgagcca 80100 gggtaggcat tgggagcagt ttaccaggag aagaaactca gccccttgca gagcggggag 80160 cctcagaagc agcagagaag ccttgcccca caagactctg agtccctaag ccaccccctg 80220 tagagctcca gggcactggt gaggatggcc tcctggaggc ctcagctctt ttttgtgcta 80280 atgtccaggg ctgtcaccat tccgcccccg cccccctcac agctgaacta tttttcttct 80340 ttctggggct ggggtggggc tgccttcctg 
cctggattca tgggtaggct ggattgcctt 80400 acccccagga gagaggcacc aggggcccag aaaagaggaa ggaggcagcc ccttccccac 80460 agtgacctcc tcgcaattca catgcagcaa caggcccacc tctcagagta agatgctgag 80520 gcacaggtag gagctgtgta gagagacact gggaagaggc acctctcaca gctctgagct 80580 gacctccagc cctccaggga tgggggaagg tagacccatt ggtgatgaag acagctcagt 80640 gagctgtggc agaggaagtt cccaggactg ggagacgaca gtgactcaac gctgctcatt 80700 tctagactgt gctttctgaa agtggccctt cagttacccc cacagcttga ggccacacaa 80760 gcctgggatg gccagttagg cagacgcaag cagggattca gggggagtgc actagggtgt 80820 gtgggcaggg gcagaggcca ttgaggcagg tgaggagaaa ttttcatcct cttcctggtc 80880 tgcccctctc ctggggtcta atttcctcta ttccgtctgc ctctggctcc ctggctggct 80940 cctcttcact ctcttgcctg ctcaccccag aggtcccagg ggctcagccc accacaaatg 81000 gtccccaagt tgtagctgac ccttccatgt ccatcccatg aggaccctca tctgcctgag 81060 tatatctctg ggctcctctg aaaccagaag tcccacctca ctgactgctc catggctagg 81120 cagcatccac ctgccactgt tccaggccag aatgactggg cattgtcccc ctgcctgctc 81180 cctcacacct acagctcatg cccccaaaat gctgttggca tccttcatac aaccctcaca 81240 gccaccctgc cccctgctgg gctgtaggtt tggtctcctg gtgccttaac ccctttgtcc 81300 atctgcccct gggcagccac ctgagcccct gagcactgct gtgctcacat gtgcagtagc 81360 cccctcaccc agagccagca tcgaagtctc cacaggccaa ttctggcctc atcactgctc 81420 ctggaacccc agggccctgt gccccaatct ccccatctgc agcatgggtg tctcttccta 81480 ccccaagcct gccccccaga gctcaagaca tccagagcca tctaatacat atgtaataca 81540 tacaaattac ataacaattt gtaatatgtt gtactacata cacattttca tatgaattca 81600 tcataatacg tacaaattat gatgtcataa tatattgtga tgtgacaata cacatgaatt 81660 atcatgtcat aatacattgt gatgtcataa cacatactaa ttatgatgtc atgatatatt 81720 gtgatgttat aatgcatatg aattatggtg tcacaataca tatgaattat gatgtcatga 81780 tacattgtga cgtaataaga attgtgacat cataatatat catgatgtca tatgcatgca 81840 acttatgatg tcatgatata ctgcaatgcc ttaatacaaa ccaattatga tacagtaata 81900 tgttgtgatg tcataatatc atatttattt atcatattta tcatataaca ttttgtcaga 81960 tatttttata agaaaattga gtgaaatttt gtaacattaa catataaaga agcttacaat 82020 catgatgaaa 
gatgaaagag gagcaggcat ctcacgtggt aggagtggga acaggaaaga 82080 tgggggagag atacgcctca cttttaaacc accagatctt gtgtgtactc actgtgacaa 82140 tgacagcaca gagccatgag aaatccatct ctataattca tccacctctc accaggcccc 82200 acctgtaaca tcagggatta aaattcaata tgagatttgg aggggaaatc taaactatat 82260 catatgacga ttagaaaaac agatgaggtc cttcacgtct ctttgaagca attgtgaatg 82320 ggagttcact catgatttgg ctctctgtct gttattggtg cataagaatg cttgtgattt 82380 ttgtacattg attttgtatc ctgagacttt gctgaagttg cttatcagct taaggagatt 82440 ttgggctgag acagtggggt tttctagata tacaatcatg tcatctgcaa acagggacaa 82500 ttttacttcc tcttttccta attgaatacc ctttatttcc ttctcctgcc taattgccct 82560 ggccagaact tccaacacta ggttgaatag gagtggtgag agagggcatc cctgtcttgt 82620 gcccgttttc aaagggaatg cttccagttt ttgcccattc agtatgatat tggctgtggg 82680 tttgtcatag atagctctta ttattttgag atacgtccca tcaataccta atttattgag 82740 agtttttagc atgaagggtt gttgaatttt gtcaaaggcc ttttctgcat ctattgagat 82800 aatgatgtgg tttttgtctt tggttctgtt tatatgctgg attacattta ttgatttgcg 82860 tatattgaac cagccttgca tcccagggat gaagcccact tgatcatggt ggataagctt 82920 cttgatgtgc tgctggattc ggtttgccag tattttattg aggatttttg catnnnnnnn 82980 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 83040 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnttttagg gcaggcctgg tggtgacaaa 83100 atctctcaga atttgcttgt ctgtaaagta ttttatttct ccttcactta tgaagcttag 83160 tttggctgga tatgaaattc tgggttgaaa attcttttct ttaataatgt tgaatattgg 83220 cccccactct cttctggctt gtagagtttc tgccaagaga tccgctgtta gtgtgatggg 83280 cttccctttg tgggtaaccc gacctttctc tctggctgcc cttaacattt tttccttcat 83340 ttcaactttg gtgaatctga caattatgtg tcttggagtt gctcttctcg aggagtatct 83400 ttgtggcgtt ctctgtattt cctgaatctg aatgttggcc tgccttgcta gattggggaa 83460 gttctcctgg ataatatcct gcagagtgtt ttccaacttg gttccattct ccccgtcact 83520 ttcaggtaca ccaatcagaa gtagatttgg tcttttcaca tagtcccata tttcttggag 83580 gctttgtttg tttcttttta ttcttttttc tctaaacttc ccttctcgct tcatttcatt 83640 catttgatct tccatcactg atgccctttc ttccagttgg atcgtatcag ctcctgaggc 
83700 ttgtgctttc ttcacgtagt tctcgagcct tggctttcag ctccatcagc tcctttaagc 83760 acttctctgt agtggttatt ctagttatac attcctctaa gtttttttca aagttttcaa 83820 cttctttgtc tttggtttga atttcctcct gtagctcgga gtaagtagtt tgatcatctg 83880 aagccttctt ctctcaactc atgtggagaa ataggaacac ttttacactg ttggtgggac 83940 tgtaaactag ttcaaccatt gtggaagtca gtgtggcgat tcctcaggga tctagaacta 84000 gaaataccat ttgacccagc catcccatta ctgggtatat acccaaagga ctataaatca 84060 tgctgctata aagacacatg cacacgtatg tttattgcgg cattattcac aatagcaaag 84120 acttggaacc aacccaaatg tccaacaatg atagactgga ttaagaaaat gtggcacata 84180 tacaccatgg aatactatgc agccataaaa aatgatgagt tcatgtcctt tgtagggaca 84240 tggatgaaat tggaaatcat cattctcagt aaactatcgc aagaacaaaa aaccaaacac 84300 cgcatgttct cactcatagg tgggaattga acaatgagaa cacatggaca caggaagggg 84360 aacatcacac tctggggact gttgtggggt ggggggagga gggagggata gcattgggag 84420 atatacctaa tgctaaatga tgagttaatg ggtgcagcac accagcatgg cacatgtata 84480 catatgtaac taacctgcac attgtgcaca tgtaccctaa aacttaaagt ataataataa 84540 taaaaaaaag aaagaaaaac agatgaggct gggtgcagtg gctcacgcct gtaaactcag 84600 caccttagta atccgaggtg ggcagatcac aaggtcagga gattgaagcc atccagtcta 84660 acacggtgaa accaccccca tctctactaa aaatacaaaa aaattaactg ggtgtggtgg 84720 cactcagcag tagtcccagc tacttgggag gctgaggcag gagaatcgct taaaccctag 84780 agacaggtat tgcagtgagc tgaggtcata ccactgcact ccagcctggg caaacagagt 84840 gagaatccat ctcaaaaaaa aaaaaaaaaa aagaaaaaca aaaaaagaaa agaaaaacag 84900 atgaaatagt aatggtttct acagatactt tcattacaaa gcatatggat aaatgattta 84960 attcaacatt ctgtggtcag aagagagaag ggaggtgtaa aggggacttt gactgcattt 85020 gttatacttc ccttatgctg ttgttatgag ttttgatgtc accacctgaa ggggtattca 85080 tggacggaag aattattgct attgttgtga ttattccttt ttcttttgta ttagtaaaat 85140 aaatctttta gcttctcata taattttctt taaaagccct aagagtttcg gttaaattct 85200 tcgttattgt gtgttataaa aattgacagg gaaatggcta aaataggtta aaattacaca 85260 aactctagga gtccagtttc tgttagacag gcttaggaga gacagaactg gaaacactcc 85320 accatcatag acatcagaca catggggctc actttctgtc 
ccagccctgc ccagatccac 85380 cctcttctaa ggccttatcc aggcctggcc tcaccctaga atctcctctc acagaaggag 85440 atcaaaatta aaggagatca aaaatttgag tggtggctcc tcctgcctct ccttagctga 85500 tgcccacaat ttcctgaaaa taaaaagcag ataaatggga gcaaatgatt atctatttgt 85560 gggtcacaat ttctttttca ttgaagccag tgcttgtaga gacatcccac ctagcaaact 85620 gtttttctgc cgatccggta gatgctctac aaagcacaag aaagtcaata taaataccaa 85680 aaatccctct gaacagttca ccctttctgt atcccttcca tctgtctata tagattttat 85740 tctacacatt ttctttttaa ggtaagtaat tttaaaaatg aaagaaaaat agaaatgctg 85800 ggcccttcat ttaaagccta ggaattacag aacacttagc agccaactac cagggttgag 85860 gattcactca catcaggtga tgtttgcagc acagtgctgt gtaacagact tctgaacaca 85920 tagtacacac tcagtaaaca ttgtattaac tcatggatac atgtttttca aatgcagact 85980 tactcaacca ttgatgcctt ctctaggctc tataaccttt aaagagccag cagagaataa 86040 catgtttgta tatagtgatt ggtggtttcc actttgggcc aaaagggttt gtgttgtggt 86100 aagagtgttg ggtgtcagga actctctgtg ctgttcctac tttctctgag tgctactaca 86160 gtggatatag tgggagaatc aacagtctcc ccttaagggg agactttaag aagccaattc 86220 atggacccct tccaaacttg cagaatcaca ttactaagac agaggcctgg aatcagaaat 86280 gatttatatg tacattgaaa cttgagaggc aattctttgc taagtggctc tcagcctagg 86340 tgtccaatca taatcacatg accactttaa agaaatacca tcacctgtgc tctccccaca 86400 ggttctgtct gttgagcttg gtgggctcat acatatagtt ttaattagga agtcaaattg 86460 tccctctttg ttgattacat aatattatat ctagaaaaat ctaaagacca ccaaaaactt 86520 ttagatttga taaataaatt taataatgtt tcaggataca aaaatcaatg tagaaaaagt 86580 agtagcattt tcatacacta ataatgatca agctgagaac caaattaaaa ggtcagtttt 86640 ttttacaata gctacaaaaa agtaaaacac atagaaatat aattacttaa gtaggtgaaa 86700 gatctctaca aggaaaacta caaaactctg atgaaataaa ttgtatatga cacaaacaaa 86760 tgagaaaaac atcacatgtt cattgattgg aagaagtatt atcattaaaa tgaccatact 86820 gccccaacag tctacatatt aagtgtaatt cctacaaaaa ttctaatgtt agtttttata 86880 gaattagaaa aaaattatat ttatatggaa ccatagaaaa gcctcaatag ccaaagcaaa 86940 tttgatcaaa gacaacgaag ttggacatat tacattacgt gacttaaaat tattctagaa 87000 ggctttaata accaaaacag 
catggtaatg atgtaaacag atgcacagat caatggagca 87060 gtatagagaa cctagaaata aagccatata cctacacaca actgatcttt tccaaagtca 87120 acaaaaacac acacagaaaa atgacatttt attcaacata ttgtgctgga aaaattacgt 87180 tactatatgc agaagtatga aatggaaccc ctaactctca ccatatacaa aaatcaactc 87240 aatatggatt aaaaggctaa aatgtaagac ctgaaaagat aaaaattcta gaagaaaccc 87300 tatgataaac tattctggac attggcctag acaaataatt catgactaag atctcaaaag 87360 tagatgcaac aataacaaaa atagacaaat ggaacttaat taaactgaaa aagctcctga 87420 aaagaagctt ttatttaata ggtgagcaga caagctatgg aatacaaaaa aaatgtttgc 87480 caactatgca tgtgacaaag aactaatgtc tagaatatat aaagaaatca aacatctcaa 87540 caagaataaa acaagaaact tcattaaaaa gcaggcacat aacgggaaaa gatatttttc 87600 aaaagaagac aatgatggcc aataagcatg taaaaaatgc tcaacattgc caatgatcag 87660 agaaatgcca attaaaaacg acagtgaaat accattttac accattcata atggctaatt 87720 attaaaaagc agaaaaatga tagatattgg taaggatacc gagaaaagag aatacttata 87780 cattgtttgt gggaatgtaa ctttctacag cctctatggg aaacagtatg gagatttctc 87840 aaaaaactaa aaaatagaac ttccatttga tccagctatc ccactactgc gtatctaccc 87900 aaaggaaaat aattcactac ataaagaaga tacccacact catatgttta ttgcaggatt 87960 attcacaata gcaaagatat ggagtcaatt taaatttatc tatcaatgat tgaataaaca 88020 aaatttgcta tacatttata ccatggaaaa ctactcagac ataaagaata aaattatgtc 88080 ttttgcagca ccatgaatgg aactggaggc cattattgta ggtgaaataa ctcagaaaca 88140 gaaaatcaaa tactgcattt tcttacctat atatggaagc tcaataatgc atacacttgg 88200 atatagagac tggaaaaata gacactggag actcagaaag ataggaggtt ggtagagggg 88260 ttagaaatga gaaaaacacc taactgggac atgagcaccg ttcaggtgat tgttacaccg 88320 aaagcacata cttcatcact ctgcaatatg tccctgtcat gaaacttcat ttgtactaac 88380 atattaataa agagaaaaaa actgactttt atcaagagag cagaatgaat agaccttcta 88440 cttttcataa atacttaggc agaaaaataa ttttaataaa aataaataaa aaatgtatat 88500 tatatatatt ttacataatt tatatattgt aaatatatat caatattttt atatatcaat 88560 atttgtatat ataaaatata tatcaatatt ttatatatat aaatatatat tatatattat 88620 atatataata tattatataa aatatatata atataaaata tattatatat tatatatatt 88680 
tatatatata aaatatatat atatggggtc agggtctcac tctgtcaccc aggctggagt 88740 gcagtggcat gatttcagct cactggaaac tctgcctccc gggttaaagt gattctccgc 88800 ctgcctcagc ctcccgagta gctgggatta caggcgcccc ccaccatacc cagctaattt 88860 ttgtgtttat agtaaagtac agtgtttcac catgttggcc aggctggtct taaactcctg 88920 acctcaggtg atccatctga ctcagcctcc caaagtcctg ggattacagg catgagccat 88980 cacacctggc caataatatt gcaatataag aatggtataa aaagacactt tgatgagtta 89040 gagtagttct tagcgcagac aatatgcaag attctaagcc attagacatt tgtagacaga 89100 atatctaaca gtgtaaaata aataacacga agattcattg gcaatgagaa attgactttt 89160 ttcaatcata ttagatgaca ttaaaaccat tataaaattt actgttttgt acataataaa 89220 ggatcataat gttaaacaac ttcattaaaa gtttgacaaa ttaggcatat aaataggcag 89280 catgttgacc agtaaacaga aaatacactt ttcaaagacc aaacaaaatt atttttattt 89340 atttattcat ttatttattt attattggtg gatgagcagc tttattagct ggggatatag 89400 tggggtcctc tccctgggag gtggggtctt tcactggtca ctcccggcag tggtccagga 89460 ggcgccaggc agttcagtgc tgggcttagc tgggggctga gccttgaaga aggcgaaccg 89520 tgcagggaag tagtagctgt ggggtctcac ctcccgctcc gccctgctgc actgggtctc 89580 ctggtgctcc tcagggtccc gccgagcctg agtctctata ggacagtggc ccatccggcc 89640 cgaaaccttc tcctcagagc ccagtttgac gcaggccagg catttccact tccttccctt 89700 gggatggact tcgcactcgt gtttcttcca gtccttccgc cggccgcttg tctgcctgag 89760 cttaaattcc agtttcacaa atgttccagc tgggaaggac gtatccaccg cgccgtccac 89820 accgttctcc cggaaggccc atgccgggga gggtgcattc ctccagggct acctgcaggc 89880 cccggagctg ggccccggag ctcggacccg cctgcctccc cagcgccccc cgcgcccacc 89940 cacggggcca gcagcatccg cagccgtggc ttgcttctgc ggtctctcac tctggccctg 90000 cgaagctcct gtgcaccgct cagctctctg agcccgctgg gaggtgcctc ctcccctgct 90060 cttcccctgg gtggctatgc ccacagaact ctgggcagag gtcaaagagc caggaaatgt 90120 ctctttctcc aaattgactt tggtgtgcgc ctggttctct ccactccctc ctgccctgtc 90180 cacactgttc cctggggccc gcaggtttag caaagttccc tgccccctgc ctgggccagg 90240 aagcagtcct gttgcccact cccacccttc aaccctttta tggcccattc tctctcccca 90300 ctgggtctcc cacacgacaa cccctcctcc ctactgtccc cggagcccct 
ctctggttct 90360 cgcgctcagc ctcttccctg actgcttctc catctccatc ctgaatctcc cagctccagt 90420 agggtgcccc ccaatcccag ggcccaggca aaaccaacaa aattatttaa aatgggaata 90480 tttgaattcc actgaattcg taaaagcaga aacccatctg gttatatttt aaagaattat 90540 gaattaataa tagcaactat caatttaaag tgtaaccagg gttaagaata cccaccattt 90600 taacaacaac aacaatgaaa gcgtgttata gctaacaaac tagttctgga aagttgcagg 90660 atacaatatt aacatgaaaa tcagttgcat ttgtatacag taacaacaaa atatctgaaa 90720 aaggaataaa gaaaacaatt ccatttacaa tattatcaaa tagaatgaaa tacttaattc 90780 attaaataga atgagtttaa ccaagaaaat taaagatctg catactgggt cgggcgcggt 90840 ggctcatccg tgtaatccca gcactttggg acaccaaggc gggcggatca cgaggtcagg 90900 agttcgagac caggctgtcc acatggggaa accccgtctc tactaaaaat aaaaaaaaaa 90960 ttagtcggtg gtggtgatct cctgtaatct cagctactca ggaggctgag gcaggagaat 91020 cacttgaacc caggaggcac aggtttcagt gagccaagat tgggccactg cactccagcc 91080 tggatgacat agtgagattc catctcaaaa aaaaaaaaaa aaaaaaaaaa atctgcatac 91140 tgaacactat gaaatgttga tgaaagaagt agaagaatag gccagttgcg gtggctcacg 91200 cctgtaatcc cagcactttg ggaggccgag gtgggtggat cacgaggtca ggagattgag 91260 accatcctgg ctaaaacgat gaaaccccgt ctctactaaa aatacaaaaa attagccagg 91320 cgtggtggca cgcgcctgta gtcccagcta ctcgggaggc tgaggcagga gaactgcgtg 91380 aacctgggag gtggagcttg cagtgagctg agatggcgcc attgcactcc agcctgggtg 91440 accgagcgag actccgtcta aaaaaaaaaa aaaaaaaaaa aaaaaaaagc agcaagcaag 91500 caagcaagca agcaagaaag aagtacaaga atacgaaatg tgaaatatat cctgtgttca 91560 tggattctaa aaattaatat tgttaaaata tctatactgg acaaagtcat caacaaagtt 91620 aaagaaattt ctatcaaaat tttaatgcct ttaaaataag tgtagaacaa acaattctaa 91680 aattagtata gagccatgaa agaccccaaa taggcaaata ctgtgaagaa cagaaaggct 91740 gaatgcctca aacttcctga tttcaaactg tattacaaag ctatagtcat taaagtagta 91800 tagaacttac ataggaacca ctgaaacaga atagaggacc tagaaataaa ttcacccata 91860 tgcagtcaac tggtcctaca gaaccaggaa aagataggga catcaagaag tggtttagaa 91920 aaaactagat atgcacacac aaaaagtgaa actttcttat atcatcacaa aaaatgagtt 91980 taaaattaaa ggcttaaaca taataactga 
aatcacgagt cgtctttaaa aaatagggaa 92040 aaagctcctt caccccggtc gtggcaatga tgttttggat tctacacaaa gaacacaggc 92100 aacaaaagca aaaatttaaa aaatggaact atatcaaagt ttctgcatga taaaagaaag 92160 aatcaagaaa atataaagac aatatatggg atgggagaaa attttcgtaa accatatgta 92220 ggataatatg ttgctattca aaatatacag aatactaatc aatatgaaaa aagcatccac 92280 aggagaacaa aacaaaaacc aattccctga ttaattgggc aaaatattca tttttccaaa 92340 gacatacaaa tggccagcag gtatatgaaa agtttctcaa catcactaat tatcagtgta 92400 attaaaatca aaatcaaaat gagaaatcac cgtatcatgg tgttaggata actattatca 92460 aagtgtcaaa agaacaaagt gttagggtgc acagaaaaga caatatttgc acacagttgc 92520 agagcatgtc ctttggtgca gccattacaa aaaaaaaatc cagtatggag tttccttaaa 92580 atttttaaac ttgaagaacc attaatccca atttggagaa tatagccaat ggacataaaa 92640 tttaagtata gccgagggcg tttgcatgcc tctgatacaa atagatggat aaactgtaag 92700 acagagataa tttcagcttt aataaggaat gaaattgttt tagttacaaa aatattgatg 92760 aaccttgaag acatgatgct aagtgagatc agcaaaatac agaaaggcaa atattgcatg 92820 atctcatttg tatgtagata tattaaaaaa aagaaagaaa gcgggagaag ggtagtttgc 92880 atgggctagg aaatggggaa aggaggagat atatccattg aagggtgcat accttcagtt 92940 atataatgaa aaactgctgg ggacctaatg tgcagaatgg tgactatagt taataataac 93000 gtgtagttga aatctgttac taaagtagac ctgaggtggt ttcactacac acactgaaat 93060 gtataaagta actatgtgag gtgttagata ggttaaccag cttcactgag ataatttaaa 93120 gacgtataca catctcaaag cactctattg tataccctaa ataaatacac ttttcaatgt 93180 gtaaatgtgg aaaaaaatta acagtagtca gcaggtcaca tctagccaca tgtaaacttt 93240 atttaaaatt gttggccacc ctttctggct caacccggga acaaccggag cacttctggc 93300 cccttggact ttgacgctcc tcccactgtt cctgtactgg ggaagatatg gatagttccc 93360 aggggtgggt gggcgcagtg gctcacgcct gtaatcccag cattttggga ggccaaggtg 93420 ggcggatcac ttgaggtcag aggttgaagt ccagctagac caacatagtg aaaccctgtc 93480 tctactaaaa atagaaaaat tagctgggtc tggtagcctg cgcctgtaat cccagctact 93540 caagaggnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 93600 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnngtg tgaggcagga 93660 gtattgtttg 
cacctgggag gcagaggtcg cagtaagcag agatcgtgtc tgtgctctcc 93720 agtgtgggcc acagagtgag tctgcgcttt aaaaaaaaaa tataatatat atattatata 93780 ttatatataa tatatatata atatattata tatctaatat atatattata tataatatat 93840 attatatata tactatatat attatatata taatatatat ataatatata taatatatat 93900 agtatataat atatatatta tatatataat atatattata tataatatat aatatataac 93960 atatatatta tatgttatat atatatgtat gtatgtattt cccagggttt tttcacatcc 94020 cagactgtgc agatcaggga gggccatcag cagtcccttt actagttgcc cacagacccg 94080 tttggctcca ggcaatgcat atcagggcat ccaggtggga gccacagcag ccagcaggga 94140 ggaggctgcc tttggctgcc tgcagtctgg tggcctctgg ctccgcaggt gtctgtaagg 94200 ccccagcgca gcccctcctc acattgcctg tgaaggcaat gtaactgcag aacatactga 94260 ggaggaaatt gaggaaaaaa atgtccccca tatattagaa taattgtcat tagatcgctt 94320 atgctgcaga cagggtcact gtttatcatt actgctgtga aacccgctac aattggaaga 94380 gaaaatgatc tagggcatac cctttgattc ccataagttg gtgtcagtag gtttcatgca 94440 gaaatttatt ttgaccatga tttacaaagt tacgtccttc tagatcaagg cagtcaaaat 94500 tgaacaactg ttcatggaaa atgaattgtt cagctgaaaa ctaaatgaga cccttatgta 94560 cgtgagcata gacataaatt taaaattgga gagactgtgt caccctttca cattcaccct 94620 ggcagtgata gctgtgatgg ctgtgaacca tggcaggtta gaactcactt ttgctttgat 94680 aagaaagatg aatcttttgt tggtctacca ctaagtaaag aggaaaatga gttggaaata 94740 agtaaacaat taaagaaaat atgactaaaa tggggtttac agaatataga ttatgaacat 94800 gaatagacac tgaataatct aaaatataaa gatagagctg gaaatcatag tgagcagatt 94860 ggaagtgaag gaacattcta gagagatgtt gctcctgcat ctgttcattc tgaaattact 94920 gacagtaaca aatgtcagaa gatgttggag aagatgggtt ggaaaaaagg agacggcctg 94980 gggaaggatg gtgggtaatg aaatctccga tttagcttca gcttcagtga acatatggag 95040 gcttggggac aggcaatcca tcctcaattg aagacgttca gcttctctga aacaaaaaac 95100 aacctgggac aaagcgtgag agaggtttgc tgaaaatttc ccagaaacca aaccttaaaa 95160 agatgaccta ggtaccaggc cttggataaa aggaactgta cagtgaaggt taatcataga 95220 agaaaactca agcttttcaa aaaatagagt ttggaaactc ttattttatt atatatttgc 95280 agtacttttc tccccaaaag agtctgtggc acagaggaac agtgtcacag tttacccctt 
95340 cttgattcag aaatgtgtaa taaagtttgg tttgcaaatt ttaaaaaaca ttttttaaac 95400 taataaatat tgactcaagt tattcagtaa gtggactaaa gtttacaggt taaagatgag 95460 tttatcaaac ttccttattt tatcttgtca tttatgacat ccatataagc aaaaagccac 95520 ataagcaaaa ctcataacca ctaatgactt aaatgtacat ttgtctgtgt gtccatgtat 95580 tcacagtaag gtgcacagca aaagaaacaa caaaagttta taaaaataaa tctgactaca 95640 tgcatcattg tttatgccct ttagaaacta gataaaagaa cctcttatac ttgaaatagc 95700 ctaaatatta ttgaagaaca aatgaacagc tgatatattg tagaaaatta gtcagtgttt 95760 tcttttttga agaatccgtt tattagtact atatcttcag tatttatatt ggtttgtttc 95820 atagctaatg tgatatttag atatgaacaa ctgagtacag tgttgaaata gtgtgctggc 95880 atttgtagtt ttcataaata ttattgcagg cagtggtgtt gtgccacaga aatctgattt 95940 ctagtacata aggattactt agccagggcc tcatgtttaa gatatttaat gaaaatgtct 96000 tcaactgcaa taaaaacatt gtaacattaa aaatactctt ttcaataaat tctaaattaa 96060 aaaattcaaa tgataccttt tatggagtta gggaagtgct aataaggtaa aatggcaact 96120 gaagccaaaa aatgtaaatt caggtaaata cttttaccct tattagttgt atgttaaagg 96180 aaacaataat aaacaataat tctgaccata tactatttac tgcagtaaag tatttgagaa 96240 atttgtgcat aatacataat taattttcta atggtataaa agtaatcaca ttctacaaat 96300 tattacaaca cggtctattg aagacagcgg catttcaagt gaagaatctt agagtttctc 96360 acgaggcagt gatatatgcc atatgaaatc taggtaaaat attttactca tgtatgcaac 96420 agttgatatt tctgctttcc aggaaaataa catgctttaa aaacttaatg caggtaacta 96480 aaatcaccaa gaagttataa gaactcacac aatgactaat atagttgaaa gaaaattatg 96540 aaaacttctc aggaccagaa gtaaataaag ggtataatcc agagaagtat agacctaaga 96600 ggccagtgct ccattcagat gcatctgatc acaaggacat cagatccaca tgggttgttc 96660 agcttctgac aaagcaaatg gaataaacaa agagaaactg cctgcaggca tccagagtgt 96720 gatgtctcct atatgtgaga gctaaactat aaattatgtt gcacacaagg agaagagcta 96780 ttagagtgaa gcattaaatt atatggcaca taaaactcag atatgaggat tataattaca 96840 tgtttgccaa tataatgaaa acacaagaaa ccgaaataat gtggaaaaaa ctatgagctt 96900 gtcaagtcat atttggaaaa gaggcgaatg gaaagtaaac ttgaaaattt ggtaaaacaa 96960 cttaatgtac agcatacaaa ttacacattg aaatagactg 
agatggtgag aggactagta 97020 aactggaatg ctgagcacaa ttaatgtagt aatatacttt ttagaaggca aattaataca 97080 aaatacaaat atatatgaaa attagatgag aagaaatgaa aaacatttaa ttgctttact 97140 atactatata aacaggaggg caatattcaa aaaaatcagt gactcataat tttcagaaat 97200 tggaaaccag gcatgaatcc tataaaagtt tagagtgtga tgtgtcagaa aagataaatt 97260 aagaaatact taaaataaat aatgattttg gttattgtga atactgcttc aataaacatg 97320 ggagtgcaat tatcttttgg acatactgat tttatttcct tttgatatat acccagtggt 97380 gagattgctg ggtcatatat atggtagttt tatttataac ttcttgaaaa acctccatgc 97440 tgtttttcct aatggtggta ccaatttaaa ttctcaccaa cagtgtataa gtgcctattt 97500 acagatgaat gaagaaaatg ttatacatac acacaggcac acatacacac acaggaataa 97560 tgctcagtca tagaacagaa tgaacttctg tcatttatgt gaacatggat gaacctagag 97620 gacattatgt taagtgaaat aagcccatca cagagagaaa aagaccacat gatgtcactc 97680 atatgcaaaa tctaaaaaag cagatctaat agaagtaaag agtagaatgg tggttactag 97740 aggctaggga gagtgatggg atgaaggaat ggggagaggt ttgtcaaagt tacaaagtta 97800 cagtcaggga gaataaattc tgatgttcta ttacacagta gggtgactat agctaatata 97860 atgtactatt tattataaca cagctagaag agaggttttt gaattttaat agcacaaata 97920 aattataaat gtttacactg ctagatatgc aaattaccct cattgggtca ttttacagtg 97980 taaacatgca ctgaaacatc acactgtagc ccataaatat gtacaattat tatgtgtcaa 98040 ttatacataa aataaatctt aaaaaataat aatgatcgaa ttgcataaaa tcaaaacgaa 98100 gaaaaactct taaaagtagt gagatagaag acagctaacc tgtgaagctg agggaaaaca 98160 cttacctctg tatcaagtca gaaaacagaa aaattaatat cagtaaaata ttaaaagctt 98220 ctggacagca cagaaatcta ttaatagagt gaaaagacaa catataaaat gaaaaaaaat 98280 atgtgcatct ggcagggagt taatatccaa aatatatata tgactcaact caacagcaat 98340 aaaaatcaaa aacccaacta aagagtgagc aaaaaacctg aatagacatt tttccaaaga 98400 agacacaaag aaatgctcaa catcacgaat catctggaaa atgcaatcca aaaccacatg 98460 aaatagtacc tcaccccatt tagaatggcc attatcaaga agacaaacga taacaaggtt 98520 tggatgagga tgtggagaaa agggaacact tatacactgt tggtgtgaat gtaaatttag 98580 taccttttta gattcattta tttattgata agaaacctgc atggaaaact atgcaggttt 98640 ctccaaacaa aaaacaaaaa 
cctaaaaata gaactaccat ataatccagc aattccatta 98700 ttgagtatat atcaaaagga aataaaatta gtatattgaa gagatagctg ctctctcatg 98760 tttatcatag ctctattgac aacagccaag atatggaatc aatcttgtgt ccatcaggga 98820 taaatggata aagcgaatat ggtgtataca cataatggaa tacgatttaa ccataggaaa 98880 aaaaaaatcc gccctttggg aggccgaggc gggcaaatca ggaggtcaag agatcgagac 98940 catcttggtc aaaatggtga aacccctttt ctactaaaaa tacaattagt tgggcgtggt 99000 ggcacgcgcc tgtagtccca gctactcagg agactgaggc aggagaatcg cttgaacccg 99060 ggagacggag gttgcagtga gccaagatcg cgtcactgca ctccagcctg gcgacagacg 99120 ttccgtttca aaagaaaaaa ataatattaa taaaaagaat aaaatccggc gctgcgcggt 99180 gacatcagtc tctgtcgtta atgcctcgcg ccggctaccg tcctgcgcag tctctttctg 99240 aggacccccc cccccactct ccgccttcca ataaggagtt caggttttgc ggtcgccgtg 99300 gttgctgttc ctgctgccac aggttggaac tggagatgcc tcttccttct ctcaggacag 99360 aaccatgagc ctagcggcag cgccggttcg cgaagctccc cctccgccaa cgggcgcctc 99420 ctcagagccg tccgtgcccg ccctgccggg agctgacccg cagcgcagtg cagagttgct 99480 cctgttggcg gtgaccaggg agggactgga gcggcggatc atctccagga agcgggctga 99540 gtaggaactg cagccgccac atcctctctt tacccgggga tgtgcaggat taccgtgaaa 99600 tcatgactcg tcatcctgcg aattaccaat gggaaaattg gagtctagaa attattgcct 99660 ccattttagc ccaccggttc cccagtagct gtattggggt gatgaagtgc tccggaacaa 99720 atgcgctgcc atgatagttt tctgaaaagt aacatgtttg gtttcccaga acacaataca 99780 gactctggag cttttaagca cctttatatg ttattagtta atgcttttaa gtcagagtag 99840 tttatcaaag gaaaatttga atgattggaa taaggactcc acagcatcta attgtagatg 99900 tccaattctt ctcatactac aaatcatttc caggaaggaa aagataggac ctttgaaaaa 99960 tctgatggat cggccatgtg tttttatcca ccatcactaa atgatgcatc ttttcctttg 100020 actggattca ataaatgtgt tgttttgaat cagtttcttt gaattgaaag aagccaagaa 100080 agacaaaaac atagatgctt tcattaaaag cataagtaca atgtattggc tggatggtgt 100140 tcattccgga ggaaggaata ctggagttac ttatccacaa gtcttgaaag aatttgcaca 100200 aacaagaatt attgttcaca cccatggaac actttaccaa gtatgtgatc taatgagatc 100260 ttgcattgga aaggagcaaa ataaattttt tagatacttg gggatattgg tatgctggtg 100320 
agtagctgaa ttcatttcac gaaggaagct ccctacatag agaatccctt tcagaattca 100380 tgaagtgttt gagactacaa gtatattaat gtacttgttc agcggaagag cataagcact 100440 ttgagtgtta taaattcaga taatggaatg tacttcatag atgtattgtc agtttggggg 100500 tatggaggga agcacacatt cctgaaaaat gagtgtaatg tgc 100543 

That which is claimed is:
 1. An isolated peptide consisting of an amino acid sequence selected from the group consisting of: (a) an amino acid sequence shown in SEQ ID NO:2; (b) an amino acid sequence of an allelic variant of an amino acid sequence shown in SEQ ID NO:2, wherein said allelic variant is encoded by a nucleic acid molecule that hybridizes under stringent conditions to the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3; (c) an amino acid sequence of an ortholog of an amino acid sequence shown in SEQ ID NO:2, wherein said ortholog is encoded by a nucleic acid molecule that hybridizes under stringent conditions to the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3; and (d) a fragment of an amino acid sequence shown in SEQ ID NO:2, wherein said fragment comprises at least 10 contiguous amino acids.
 2. An isolated peptide comprising an amino acid sequence selected from the group consisting of: (a) an amino acid sequence shown in SEQ ID NO:2; (b) an amino acid sequence of an allelic variant of an amino acid sequence shown in SEQ ID NO:2, wherein said allelic variant is encoded by a nucleic acid molecule that hybridizes under stringent conditions to the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3; (c) an amino acid sequence of an ortholog of an amino acid sequence shown in SEQ ID NO:2, wherein said ortholog is encoded by a nucleic acid molecule that hybridizes under stringent conditions to the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3; and (d) a fragment of an amino acid sequence shown in SEQ ID NO:2, wherein said fragment comprises at least 10 contiguous amino acids.
 3. An isolated antibody that selectively binds to a peptide of claim 2.
 4. An isolated nucleic acid molecule consisting of a nucleotide sequence selected from the group consisting of: (a) a nucleotide sequence that encodes an amino acid sequence shown in SEQ ID NO:2; (b) a nucleotide sequence that encodes an allelic variant of an amino acid sequence shown in SEQ ID NO:2, wherein said nucleotide sequence hybridizes under stringent conditions to the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3; (c) a nucleotide sequence that encodes an ortholog of an amino acid sequence shown in SEQ ID NO:2, wherein said nucleotide sequence hybridizes under stringent conditions to the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3; (d) a nucleotide sequence that encodes a fragment of an amino acid sequence shown in SEQ ID NO:2, wherein said fragment comprises at least 10 contiguous amino acids; and (e) a nucleotide sequence that is the complement of a nucleotide sequence of (a)-(d).
 5. An isolated nucleic acid molecule comprising a nucleotide sequence selected from the group consisting of: (a) a nucleotide sequence that encodes an amino acid sequence shown in SEQ ID NO:2; (b) a nucleotide sequence that encodes an allelic variant of an amino acid sequence shown in SEQ ID NO:2, wherein said nucleotide sequence hybridizes under stringent conditions to the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3; (c) a nucleotide sequence that encodes an ortholog of an amino acid sequence shown in SEQ ID NO:2, wherein said nucleotide sequence hybridizes under stringent conditions to the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3; (d) a nucleotide sequence that encodes a fragment of an amino acid sequence shown in SEQ ID NO:2, wherein said fragment comprises at least 10 contiguous amino acids; and (e) a nucleotide sequence that is the complement of a nucleotide sequence of (a)-(d).
 6. A gene chip comprising a nucleic acid molecule of claim 5.
 7. A transgenic non-human animal comprising a nucleic acid molecule of claim 5.
 8. A nucleic acid vector comprising a nucleic acid molecule of claim 5.
 9. A host cell containing the vector of claim 8.
 10. A method for producing any of the peptides of claim 1 comprising introducing a nucleotide sequence encoding any of the amino acid sequences in (a)-(d) into a host cell, and culturing the host cell under conditions in which the peptides are expressed from the nucleotide sequence.
 11. A method for producing any of the peptides of claim 2 comprising introducing a nucleotide sequence encoding any of the amino acid sequences in (a)-(d) into a host cell, and culturing the host cell under conditions in which the peptides are expressed from the nucleotide sequence.
 12. A method for detecting the presence of any of the peptides of claim 2 in a sample, said method comprising contacting said sample with a detection agent that specifically allows detection of the presence of the peptide in the sample and then detecting the presence of the peptide.
 13. A method for detecting the presence of a nucleic acid molecule of claim 5 in a sample, said method comprising contacting the sample with an oligonucleotide that hybridizes to said nucleic acid molecule under stringent conditions and determining whether the oligonucleotide binds to said nucleic acid molecule in the sample.
 14. A method for identifying a modulator of a peptide of claim 2, said method comprising contacting said peptide with an agent and determining if said agent has modulated the function or activity of said peptide.
 15. The method of claim 14, wherein said agent is administered to a host cell comprising an expression vector that expresses said peptide.
 16. A method for identifying an agent that binds to any of the peptides of claim 2, said method comprising contacting the peptide with an agent and assaying the contacted mixture to determine whether a complex is formed with the agent bound to the peptide.
 17. A pharmaceutical composition comprising an agent identified by the method of claim 16 and a pharmaceutically acceptable carrier therefor.
 18. A method for treating a disease or condition mediated by a human secreted protein, said method comprising administering to a patient a pharmaceutically effective amount of an agent identified by the method of claim 16.
 19. A method for identifying a modulator of the expression of a peptide of claim 2, said method comprising contacting a cell expressing said peptide with an agent, and determining if said agent has modulated the expression of said peptide.
 20. An isolated human secreted peptide having an amino acid sequence that shares at least 70 percent homology with an amino acid sequence shown in SEQ ID NO:2.
 21. A peptide according to claim 20 that shares at least 90 percent homology with an amino acid sequence shown in SEQ ID NO:2.
 22. An isolated nucleic acid molecule encoding a human secreted peptide, said nucleic acid molecule sharing at least 80 percent homology with a nucleic acid molecule shown in SEQ ID NOS:1 or 3.
 23. A nucleic acid molecule according to claim 22 that shares at least 90 percent homology with a nucleic acid molecule shown in SEQ ID NOS:1 or 3.